Data simplification : taming information with open source tools. (2016)
- Record Type:
- Book
- Title:
- Data simplification : taming information with open source tools. (2016)
- Main Title:
- Data simplification : taming information with open source tools
- Further Information:
- Note: Jules J. Berman.
- Authors:
- Berman, Jules J
- Contents:
- Chapter 1. The Simple Life Section 1.1. Simplification drives scientific progress Section 1.2. The human mind is a simplifying machine Section 1.3. Simplification in Nature Section 1.4. The Complexity Barrier Section 1.5.
Getting ready Open Source Tools for Chapter 1 Perl Python Ruby Text Editors OpenOffice Command line utilities Cygwin, Linux emulation for Windows DOS batch scripts Linux bash scripts Interactive line interpreters Package installers System calls References for Chapter 1 Glossary for Chapter 1 Chapter 2. Structuring Text Section 2.1. The Meaninglessness of free text Section 2.2. Sorting text, the impossible dream Section 2.3. Sentence Parsing Section 2.4. Abbreviations Section 2.5. Annotation and the simple science of metadata Section 2.6. Specifications Good, Standards Bad Open Source Tools for Chapter 2 ASCII Regular expressions Format commands Converting non-printable files to plain-text Dublin Core References for Chapter 2 Glossary for Chapter 2 Chapter 3. Indexing Text Section 3.1. How Data Scientists Use Indexes Section 3.2. Concordances and Indexed Lists Section 3.3. Term Extraction and Simple Indexes Section 3.4. Autoencoding and Indexing with Nomenclatures Section 3.5. Computational Operations on Indexes Open Source Tools for Chapter 3 Word lists Doublet lists Ngram lists References for Chapter 3 Glossary for Chapter 3 Chapter 4. Understanding Your Data Section 4.1. Ranges and Outliers Section 4.2. Simple Statistical Descriptors Section 4.3. Retrieving Image Information Section 4.4. Data Profiling Section 4.5. Reducing data Open Source Tools for Chapter 4 Gnuplot MatPlotLib R, for statistical programming Numpy Scipy ImageMagick Displaying equations in LaTex Normalized compression distance Pearson's correlation The ridiculously simple dot product References for Chapter 4 Glossary for Chapter 4 Chapter 5. Identifying and Deidentifying Data Section 5.1. Unique Identifiers Section 5.2. Poor Identifiers, Horrific Consequences Section 5.3. Deidentifiers and Reidentifiers Section 5.4. Data Scrubbing Section 5.5. Data Encryption and Authentication Section 5.6. 
Timestamps, Signatures, and Event Identifiers Open Source Tools for Chapter 5 Pseudorandom number generators UUID Encryption and decryption with OpenSSL One-way hash implementations Steganography References for Chapter 5 Glossary for Chapter 5 Chapter 6. Giving Meaning to Data Section 6.1. Meaning and Triples Section 6.2. Driving Down Complexity with Classifications Section 6.3. Driving Up Complexity with Ontologies Section 6.4. The unreasonable effectiveness of classifications Section 6.5. Properties that Cross Multiple Classes Open Source Tools for Chapter 6 Syntax for triples RDF Schema RDF parsers Visualizing class relationships References for Chapter 6 Glossary for Chapter 6 Chapter 7. Object-oriented data Section 7.1. The Importance of Self-explaining Data Section 7.2. Introspection and Reflection Section 7.3. Object-Oriented Data Objects Section 7.4. Working with Object-Oriented Data Open Source Tools for Chapter 7 Persistent data Persistence is the ability of data to outlive the program that produced it. SQLite databases References for Chapter 7 Glossary for Chapter 7 Chapter 8. Problem simplification Section 8.1. Random numbers Section 8.2. Monte Carlo Simulations Section 8.3. Resampling and Permutating Section 8.4. Verification, Validation, and Reanalysis Section 8.5. Data Permanence and Data Immutability Open Source Tools for Chapter 8 Burrows Wheeler transform Winnowing and chaffing References for Chapter 8 Glossary for Chapter 8 …
- Edition:
- 1st edition
- Publisher Details:
- Amsterdam : Morgan Kaufmann
- Publication Date:
- 2016
- Extent:
- 1 online resource
- Subjects:
- 005.7
- Database management
- Data structures (Computer science)
- Languages:
- English
- ISBNs:
- 9780128038543
- Related ISBNs:
- 9780128037812
- Notes:
- Note: Description based on CIP data; item not viewed.
- Access Rights:
- Legal Deposit; Only available on premises controlled by the deposit library and to one user at any one time; The Legal Deposit Libraries (Non-Print Works) Regulations (UK).
- Access Usage:
- Restricted: Printing from this resource is governed by The Legal Deposit Libraries (Non-Print Works) Regulations (UK) and UK copyright law currently in force.
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library HMNTS - ELD.DS.54063
- Ingest File:
- 02_041.xml