Information theory meets power laws : stochastic processes and language models /: stochastic processes and language models. (2020)
- Record Type:
- Book
- Title:
- Information theory meets power laws : stochastic processes and language models /: stochastic processes and language models. (2020)
- Main Title:
- Information theory meets power laws : stochastic processes and language models
- Further Information:
- Note: Łukasz Dębowski.
- Authors:
- Dębowski, Łukasz Jerzy, 1975-
- Contents:
- Preface xi Acknowledgments xiv Basic notations 1 1 Guiding ideas 3 1.1 The motivating question 3 1.2 Further questions about texts 7 1.3 Zipf’s and Herdan’s laws 10 1.4 Markov and finite-state processes 17 1.5 More general stochastic processes 23 1.6 Two interpretations of probability 26 1.7 Insights from information theory 28 1.8 Estimation of entropy rate 31 1.9 Entropy of natural language 34 1.10 Algorithmic information theory 38 1.11 Descriptions of a random world 41 1.12 Facts and words related 47 1.13 Repetitions and entropies 52 1.14 Decay of correlations 57 1.15 Recapitulation 59 2 Probabilistic preliminaries 63 2.1 Probability measures 65 2.2 Product measurable spaces 68 2.3 Discrete random variables 71 2.4 From IID to finite-state processes 74 Problems 79 3 Probabilistic toolbox 81 3.1 Borel -fields and a fair coin 83 3.2 Integral and expectation 87 3.3 Inequalities and corollaries 91 3.4 Semi-distributions 96 3.5 Conditional probability 99 3.6 Modes of convergence 106 3.7 Complete spaces 107 Problems 110 4 Ergodic properties 113 4.1 Plain relative frequency 115 4.2 Birkhoff ergodic theorem 120 4.3 Ergodic and mixing criteria 124 4.4 Ergodic decomposition 129 Problems 132 5 Entropy and information 135 5.1 Shannon measures for partitions 137 5.2 Block entropy and its limits 143 5.3 Shannon measures for fields 150 5.4 Block entropy limits revisited 160 5.5 Convergence of entropy 164 5.6 Entropy as self-information 166 Problems 169 6 Equipartition and universality 173Preface xi Acknowledgments xiv Basic notations 1 1 Guiding ideas 3 1.1 The motivating question 3 1.2 Further questions about texts 7 1.3 Zipf’s and Herdan’s laws 10 1.4 Markov and finite-state processes 17 1.5 More general stochastic processes 23 1.6 Two interpretations of probability 26 1.7 Insights from information theory 28 1.8 Estimation of entropy rate 31 1.9 Entropy of natural language 34 1.10 Algorithmic information theory 38 1.11 Descriptions of a random world 41 1.12 Facts and words related 47 1.13 Repetitions and entropies 52 1.14 Decay of correlations 57 1.15 Recapitulation 59 2 Probabilistic preliminaries 63 2.1 Probability measures 65 2.2 Product measurable spaces 68 2.3 Discrete random variables 71 2.4 From IID to finite-state processes 74 Problems 79 3 Probabilistic toolbox 81 3.1 Borel -fields and a fair coin 83 3.2 Integral and expectation 87 3.3 Inequalities and corollaries 91 3.4 Semi-distributions 96 3.5 Conditional probability 99 3.6 Modes of convergence 106 3.7 Complete spaces 107 Problems 110 4 Ergodic properties 113 4.1 Plain relative frequency 115 4.2 Birkhoff ergodic theorem 120 4.3 Ergodic and mixing criteria 124 4.4 Ergodic decomposition 129 Problems 132 5 Entropy and information 135 5.1 Shannon measures for partitions 137 5.2 Block entropy and its limits 143 5.3 Shannon measures for fields 150 5.4 Block entropy limits revisited 160 5.5 Convergence of entropy 164 5.6 Entropy as self-information 166 Problems 169 6 Equipartition and universality 173 6.1 SMB theorem 175 6.2 Universal semi-distributions 177 6.3 PPM probability 178 6.4 SMB theorem revisited 184 6.5 PPM-based statistics 186 Problems 193 7 Coding and computation 195 7.1 Elements of coding 197 7.2 Kolmogorov complexity 203 7.3 Algorithmic coding theorems 213 7.4 Limits of mathematics 222 7.5 Algorithmic randomness 226 Problems 232 8 Power laws for information 237 8.1 Hilberg exponents 239 8.2 Second order SMB theorem 246 8.3 Probabilistic and algorithmic facts 250 8.4 Theorems about facts and words 256 Problems 263 9 Power laws for repetitions 267 9.1 R´enyi-Arimoto entropies 269 9.2 Generalized entropy rates 274 9.3 Recurrence times 277 9.4 Subword complexity 281 9.5 Two maximal lengths 288 9.6 Logarithmic power laws 293 Problems 299 10 AMS processes 301 10.1 AMS and pseudo-AMS measures 302 10.2 Quas i-periodic coding 305 10.3 Synchronizable coding 308 10.4 Entropy rate in the AMS case 310 Problems 314 11 Toy examples 317 11.1 Finite and ultra-finite energy 319 11.2 Santa Fe processes and alike 325 11.3 Encoding into a finite alphabet 334 11.4 Random hierarchical association 346 11.5 Towards better models 357 Problems 360 Future research 361 Index 363 Bibliography 366 … (more)
- Edition:
- 1st
- Publisher Details:
- Hoboken : John Wiley & Sons, Inc
- Publication Date:
- 2020
- Extent:
- 1 online resource
- Subjects:
- 410.15195
Computational linguistics
Stochastic processes - Languages:
- English
- ISBNs:
- 9781119625377
9781119625360 - Related ISBNs:
- 9781119625278
- Notes:
- Note: Includes bibliographical references and index.
Note: Description based on CIP data; resource not viewed. - Access Rights:
- Legal Deposit; Only available on premises controlled by the deposit library and to one user at any one time; The Legal Deposit Libraries (Non-Print Works) Regulations (UK).
- Access Usage:
- Restricted: Printing from this resource is governed by The Legal Deposit Libraries (Non-Print Works) Regulations (UK) and UK copyright law currently in force.
- View Content:
- Available online (eLD content is only available in our Reading Rooms) ↗
- Physical Locations:
- British Library HMNTS - ELD.DS.577109
- Ingest File:
- 03_217.xml