Friday, April 2, 2021

Booksmart

 Reading between the lines:

“I’ve never read all these novels that are beautiful stories that have continued to have a resonance with people for so many generations, like beautiful works of art that I could read at any point. But instead, I choose not to read them and I just read the Internet. Constantly. And hear about who said a racial slur or look at a photo of what Ludacris did last weekend. Useless stuff. I read the Internet so much I feel like I’m on page a million of the worst book ever.” — Aziz Ansari 2015

I had not been prioritizing reading in my adult life when I heard this interview. I was learning from other people, learning from traveling, learning from professional experiences, but I wasn’t devoting time to sit for hours on end, alone with the words that someone else had carefully put together: words that may have already resonated, or would resonate, with millions of other humans.

In the intervening years I’ve been able to devote more effort towards this timeless pursuit, aided by three productive bouts of unemployment, as well as a biologically-induced prolonged period of reduced socialization. Additionally, as a data scientist, I naturally kept data. I manually recorded the books I read, the date I finished them, the gender and nationality of the author, the year of publication and the number of pages. It would eventually morph into an entrepreneurial pursuit revisiting the basics of knowledge consumption. But first it allowed me to generate graphs like this:

You can see my reading grow starting in 2016, dip in 2017, and remain strong since then, especially since 2019.
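A graph like this boils down to counting finished books per year. Here is a minimal sketch in Python, assuming a hand-kept log with the fields described above; the `BookRecord` class, its field names, and the sample rows are all hypothetical placeholders.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass
class BookRecord:
    """One row of a hand-kept reading log (field names are illustrative)."""
    title: str
    finished: date
    author_gender: str
    author_nationality: str
    year_published: int
    pages: int


def books_per_year(log):
    """Count finished books per calendar year, in chronological order."""
    counts = Counter(record.finished.year for record in log)
    return dict(sorted(counts.items()))


# Placeholder rows, not real reading history.
reading_log = [
    BookRecord("Book A", date(2016, 3, 1), "F", "Brazil", 1999, 250),
    BookRecord("Book B", date(2016, 9, 15), "M", "Turkey", 2005, 310),
    BookRecord("Book C", date(2017, 1, 20), "F", "Ukraine", 2012, 180),
]
yearly_counts = books_per_year(reading_log)
```

The resulting dictionary maps each year to a count and can be handed to any plotting library as a bar chart.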

Meta-awareness of my library guided my future book choices. I tried to keep a balance of female and male authors, fiction and non-fiction. I even sought to broaden my authors’ countries of origin, adding Sierra Leone, Ukraine, Turkey and Brazil. I chose to expand into areas of limited exposure, to dive deeper into areas of passion, and occasionally to read for pleasure.

During the pandemic, I discovered Goodreads, the social book cataloguing website owned by Amazon. Much of the data that I was manually tracking was available on Goodreads, plus much more. Goodreads makes available the number of users who have added a book and the number who have marked it as read, and it allows genres to be crowd-sourced. Goodreads is also remarkably comprehensive: I’ve been able to find just about every book that I’ve read on the site. For maybe the first time in history, we can calculate the readership of each book. By randomly sampling the Goodreads database, I discovered that the distribution is incredibly skewed: the median book had only 21 readers!

A ton of books have exactly 0 readers on Goodreads — there’s a lot of self-published crap out there! These include such titles as “Jesus Christ — the Master Psychologist”, “James Earl Jones Stress Away Coloring Book: An Adult Coloring Book Based on The Life of James Earl Jones”, “La bruja del amor y el yonqui del dinero”, “My Child Won’t Be Sh*t”, “Typewriter Hard Enamel Pin”, and the retrospectively ironic “What’s Freedom of the Press?” Even removing these 0 reader books, the median readership comes out to 65. Less surprisingly, the top bestsellers have had nearly 10 million people add them on Goodreads. A log scale is required to make the plot above human readable.
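To make the effect of the zero-reader books concrete, here is a minimal sketch in Python of computing the median with and without them. The function name and the sample counts are made up for illustration; they are not the sampled Goodreads data.

```python
import statistics


def readership_summary(reader_counts):
    """Median readership with and without zero-reader books."""
    nonzero = [count for count in reader_counts if count > 0]
    return {
        "median_all": statistics.median(reader_counts),
        "median_nonzero": statistics.median(nonzero) if nonzero else None,
        "share_zero": 1 - len(nonzero) / len(reader_counts),
    }


# Invented counts chosen only to mimic a heavily skewed distribution.
sample_counts = [0, 0, 0, 3, 21, 65, 400, 12000, 9500000]
summary = readership_summary(sample_counts)
```

In a skewed sample like this, the mean would be dominated by the single bestseller, which is why the median is the more honest summary and why a log scale is needed for plotting.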

Nearly all of us will have a readership distribution extremely different from that graph. By definition, popular books are popular. Below I’ve plotted my books and sorted them into log-10 tiers.
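Sorting into log-10 tiers can be sketched as follows. This is one possible bucketing, assuming everything below 1,000 readers is lumped into a single bottom tier; the function name, the `floor_exponent` parameter, and the label format are hypothetical.

```python
def log10_tier(readers, floor_exponent=3):
    """Label the power-of-ten tier that a reader count falls into.

    Counting digits is equivalent to floor(log10(readers)) + 1 for positive
    integers and avoids floating-point edge cases at exact powers of ten.
    """
    digits = len(str(readers)) if readers > 0 else 1
    upper_exponent = max(digits, floor_exponent)
    upper = 10 ** upper_exponent
    # Everything under 10**floor_exponent shares one bottom tier.
    lower = 0 if upper_exponent == floor_exponent else 10 ** (upper_exponent - 1)
    return f"{lower:,}-{upper:,}"


tiers = [log10_tier(n) for n in (0, 999, 1000, 1300, 9500000)]
```

Each tier includes its lower bound and excludes its upper bound, so a book with exactly 1,000 readers lands in the 1,000-10,000 tier.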

Generally we find that the vast majority of popular books are fiction, but here I’d like to focus on the 0–1000 books. These are the obscure reads, the ones no one hears about from recommendation engines or book review articles. How I discovered each of them is a story in its own right, and many reflect my more esoteric interests. In fact, three of them were recommended on a single podcast called Sinica. A few highlights are below.

  • Wish Lanterns by Alec Ash (2017):
This book collates 9 lives of ordinary Chinese millennials, diving deep into their early adult lives. It gives personal stories that supplement the traditional narratives of China’s macro growth trends. Alec Ash was interviewed on Sinica and I figured it’d be a good read to further my understanding of China.
  • The First Filipino by Leon Guerrero (1962):
I’ve long been fascinated by the Filipino national hero Jose Rizal, mostly due to his polyglot and polymath abilities. Sharp readers will find his novel Noli Me Tángere also on that graph. This may be the authoritative biography, which credits him with creating the concept of a Filipino state.
  • Bastard Tongues by Derek Bickerton (2008):
I picked this up at a yard sale in DC in 2011. It’s about Creole languages around the world, and why so many of them are so similar. Bickerton rips to shreds the prevailing theory, that these languages are similar due to contact, and proposes a new one that is still controversial in the academic community. His respect for Creole language speakers, many of whom erroneously refer to their own speech as “broken”, is always apparent. This is an author I’d most like to grab a beer with.
  • East and West by Chris Patten (1998):
The memoir written by the last governor of Hong Kong shortly after Hong Kong was handed over has not aged particularly well, but that’s part of its value too. I learned that Hong Kong’s GDP was ¼ of all of China’s in 1997, and so many of Patten’s predictions have gone so wrong mainly because China has grown so rapidly.
  • The Autobiography of Alice B. Toklas by Gertrude Stein (1933):
I’ve also long been fascinated by the 1920s expat scene in Paris, which Gertrude Stein herself dubbed the Lost Generation. The creativity and output from that group is legendary. I’d analogized that period to the expat community in Beijing in the early 2000s, and was both validated and miffed to hear others independently make the same comparison. From what I’d read about this generation (you’ll find The Sun Also Rises and A Moveable Feast up in that graph), I had high hopes for Stein, but I found her style impenetrable and the content quite dull. There are lots of mildly interesting anecdotes of famous artists like Picasso and Matisse, but I didn’t come out with any sense of who these people were. In fact, I left with the sense that maybe these people were famous because they were connected to other famous people.
  • Up to the Mountains and Down to the Countryside by Quincy Carroll (2015):
Quincy was two years ahead of me in high school, and so when I learned he had written a novel about living in China, I felt duty call. It happens to be one of the better books about China that I have read, encapsulating third-tier city life and the expat experience in a way that non-fiction does not. He writes about language in a novel way, making the spectrum of bilingual experience accessible to non-Chinese speakers. In addition to the subject matter, which resonated hard with me, the writing quality reminded me of our shared English teachers.
  • The Transpacific Experiment by Matt Sheehan (2019):
This one shows up on the plot as [UNTITLED MATT SHEEHAN BOOK], which probably means the author needs a better publicist. In fact, Matt Sheehan is an ultimate player who lived in Xi’an and Beijing, and I played with and against him on numerous occasions. An ultimate injury is the catalyst for this book: it forced him to pursue his China journalism career from Silicon Valley. Diving insightfully into China-US relations with regard to technology, academia and immigrant politics, Sheehan’s book even taught me to view my own family’s immigration story in a new light. He was also interviewed on Sinica about this book, which deserves a place on the shelf of anyone even casually interested in China. It certainly deserves a name.
  • Rare Earth Frontiers by Julie Klinger (2017):
This was another deep, deep cut from the Sinica podcast. An academic text (I ordered it from the Cornell University Press), this book exemplifies the much misunderstood discipline of Geography. Klinger explores how rare earth (which is a misnomer) mining is more often influenced by domestic and international politics than by geological or economic factors. The regions where heavy, polluting mining is most exploited are typically borderlands far from the country’s metropoles, inhabited by minority ethnic groups.
  • American Notes for General Circulation by Charles Dickens (1842):
This is technically not under 1000 readers: somehow about 1300 people claim to have read this obscure travel journal of Dickens’ 1842 journey around the United States. It doesn’t make for light reading, but I slogged through and found many gems ̶d̶e̶s̶c̶r̶i̶b̶i̶n̶g̶ judging cities that I know well (Boston, New York, DC, Cincinnati, St. Louis) but from a brand old historical perspective. Some quotes:

party feeling runs very high: the great constitutional feature of this institution being, that directly the acrimony of the last election is over, the acrimony of the next one begins;

---

THE beautiful metropolis of America is by no means so clean a city as Boston, but many of its streets have the same characteristics; except that the houses are not quite so fresh-coloured, the sign-boards are not quite so gaudy, the gilded letters not quite so golden, the bricks not quite so red, the stone not quite so white, the blinds and area railings not quite so green, the knobs and plates upon the street doors not quite so bright and twinkling.

---

Few people would live in Washington, I take it, who were not obliged to reside there;

---

Pittsburg is like Birmingham in England; at least its townspeople say so. Setting aside the streets, the shops, the houses, waggons, factories, public buildings, and population, perhaps it may be.

My takeaway from all this is that popularity is not very well correlated with quality — most of these unknown books are awesome. A book’s popularity is likely more correlated with how well-connected the author is, how accessible its subject is, and how well it is publicized. Additionally, I will describe at the end of this post some less obscure books that gave me great joy.

Goodreads allows you to export your book data. The export includes useless information such as the binding and condition description, but not the crowd-sourced shelf genres or the total number of readers. Goodreads has also disappointingly disabled their developer API, so I wrote code to scrape this information. I then began soliciting friends to send me their exports, allowing the generation of plots like the one below.
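For anyone curious what working with an export looks like, here is a minimal sketch in Python that keeps only the books marked as read and drops columns like the binding. The column names ("Title", "Author", "Number of Pages", "Date Read", "Exclusive Shelf", "Binding") are my assumption about the export's header row and should be checked against a real file; the sample rows are invented. This is not the scraping code from my repository.

```python
import csv
import io

# Assumed column names; verify against the header of an actual export.
KEEP_COLUMNS = ("Title", "Author", "Number of Pages", "Date Read")


def load_read_books(csv_file):
    """Return the rows shelved as 'read', trimmed to the useful columns."""
    books = []
    for row in csv.DictReader(csv_file):
        if row.get("Exclusive Shelf") != "read":
            continue
        book = {column: (row.get(column) or "") for column in KEEP_COLUMNS}
        pages = book["Number of Pages"]
        # Page counts are sometimes blank, so fall back to None.
        book["Number of Pages"] = int(pages) if pages.isdigit() else None
        books.append(book)
    return books


# Invented rows standing in for a real export file.
sample_export = io.StringIO(
    "Title,Author,Binding,Number of Pages,Date Read,Exclusive Shelf\n"
    "Book A,Author One,Paperback,320,2020/05/01,read\n"
    "Book B,Author Two,Hardcover,,,to-read\n"
    "Book C,Author Three,Kindle Edition,,2021/01/15,read\n"
)
read_books = load_read_books(sample_export)
```

With a real export, the `io.StringIO` object would be replaced by an opened file, for example `open(path, newline="", encoding="utf-8")`.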

Plotting out the distribution of read genres among multiple individuals taught me a lot about what I’m reading, what my friends are reading, and what we are not reading. Most conspicuous for me are the genres that are blank — Philosophy, Sequential Art (Graphic novels or comic books), Art, Feminism, Picture Books, Social Movements and True Crime. If you had straight up asked me, “Cal, do you read books about art?” I would have known no, I do not. But had I not seen this, I would not even have considered reading books about art to be a thing. To me, this is the real value of a crowd-sourced database like Goodreads, but not the direction that Amazon has been taking it. While it is primarily used to recommend new books similar to what you have enjoyed before, it can be used to show areas of literature that have been completely outside your purview.
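The blank-genre comparison can be sketched in a few lines of Python. This assumes each reader's history has already been reduced to a list of genre labels, one per book read; the function names and the sample data are hypothetical.

```python
from collections import Counter


def genre_shares(genres_read):
    """Fraction of a reader's books that fall in each genre."""
    counts = Counter(genres_read)
    total = sum(counts.values())
    return {genre: count / total for genre, count in counts.items()}


def blank_genres(my_genres, friends_genres):
    """Genres at least one friend has read that I have never touched."""
    seen_by_friends = set().union(*friends_genres.values())
    return sorted(seen_by_friends - set(my_genres))


# Invented labels for illustration only.
mine = ["History", "Fiction", "Fiction", "Travel"]
friends = {
    "friend_a": ["Fiction", "Art", "Philosophy"],
    "friend_b": ["True Crime", "History"],
}
my_shares = genre_shares(mine)
my_blind_spots = blank_genres(mine, friends)
```

The shares feed the per-person distribution plot, while the blind-spot list is the part a similarity-based recommendation engine would never surface.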

Additional books I love:

  • The World of Yesterday by Stefan Zweig (1942):

I was on vacation in Salzburg when I walked down a street named Stefan Zweig Way. Who is this person worth naming a street after? I asked Wikipedia. It’s a rare person whose life is of such acclaim that a Wikipedia article written 70 years after their death makes clear their works are worth reading. I picked up his final book and couldn’t put it down. Written in 1942, the book is half memoir, half history as Zweig recounts the final decades of the Austro-Hungarian Empire, World War I and the rise of Hitler through his eyes as a Jew born in Vienna. Multilingual and famous in his time, Zweig lived all over Europe and embodied a cosmopolitan, passport-less Europe that vanished in his lifetime. Chock-full of great anecdotes, Zweig’s writing is incredible even in translation. The sadness he feels over this loss is present throughout the narrative. With his partner, he fled Europe, eventually ending up in Petropolis, Brazil, where they finished this manuscript, sent it off to the publishers, then committed suicide the next day via barbiturates.

  • Why We’re Polarized by Ezra Klein (2020):

Recommended by Amazon via their Great on Kindle program, this book has fundamentally changed the way I absorb news. Vox founder Ezra Klein persuasively demonstrates that our politics has become far more polarized throughout the past decade, why this was useful but is dangerous, and why our media consumption has created a feedback loop that demands more polarized media. Even though I have lived through much of the period of polarization, I hadn’t quite realized how many assumptions have fundamentally changed, as we have mainly gone from a society where our views dictated our political affiliation, to one where our political affiliation dictates our views. Serotonin and social media have allowed us to selectively choose stories with positive news about people we like and negative news about people we hate, leading to a population with polarized views of the truth. Of all the books I’ve listed so far, this is the most must-read of them all.

  • Happiness by Aminatta Forna (2018):

I read this book as part of a Georgetown Alumni book club. Forna, who is Scottish and Sierra Leonean, is currently a visiting Professor at Georgetown. While reading this novel, I couldn’t believe how difficult it must have been to write: while setting a love story and family drama against the backdrop of West African immigrant life in London, Forna had to become an expert researcher in coyotes, foxes and trauma psychiatry. Quotes include this one introducing a main character:

‘Your parents named you after Attila the Hun?’ Attila smiled. ‘Some people,’ he said, ‘name their baby girls Victoria.’

And it hits hard with gems like this:

Suffering had become a spectacle that served not to warn of the vagaries of misfortune but to remind the audience, sitting in warmth and comfort, of their own good fortune.

---

I’m looking to expand on this idea, to demonstrate how literary meta-awareness can help guide us as we navigate life. I’m looking for more “users” — people to voluntarily export their Goodreads data and tell me if/how they find these analyses useful. I’m also looking for collaborators, who see value in what I’ve done and can offer further direction.

For more graphs and code, my Github link is below: https://github.com/cal65/Reading-History/tree/master/Books