Charts
Stats at 50
Now that the database officially has 50 books posted, I thought it would be fun to throw together some stats to analyze my reading habits for the last 18 months. I am also using this post as an opportunity to add the new Charts page to the WordPress blog. It’s basically just the old charts page, but edited and formatted for the main blog.
Note: all the stats below, as well as those on the charts page, are dynamically updated from the database, so regardless of when you read this, know that the information is up to date.
One of the first things that jump to my mind when I think about my habits since December of 2008 is that I’ve been reading non-stop. But how much have I really been reading? Fortunately, I made an agreement with myself not to read two books at the same time, so no day gets counted twice. By determining the number of days in each month that fall outside of the start and stop dates of every read, and subtracting that from the length of the month, we can calculate the days spent reading. The table below shows the breakdown, per month, since I started the book database.
| Month | Year | DaysReading | DaysInMonth | PercentReading |
|---|---|---|---|---|
| 12 | 2008 | 20 | 31 | 64.5161 |
| 1 | 2009 | 31 | 31 | 100 |
| 2 | 2009 | 28 | 28 | 100 |
| 3 | 2009 | 31 | 31 | 100 |
| 4 | 2009 | 26 | 30 | 86.6667 |
| 5 | 2009 | 31 | 31 | 100 |
| 6 | 2009 | 25 | 30 | 83.3333 |
| 7 | 2009 | 23 | 31 | 74.1935 |
| 8 | 2009 | 26 | 31 | 83.8710 |
| 9 | 2009 | 26 | 30 | 86.6667 |
| 10 | 2009 | 21 | 31 | 67.7419 |
| 11 | 2009 | 30 | 30 | 100 |
| 12 | 2009 | 24 | 31 | 77.4194 |
| 1 | 2010 | 18 | 31 | 58.0645 |
| 2 | 2010 | 28 | 28 | 100.0000 |
| 3 | 2010 | 22 | 31 | 70.9677 |
| 4 | 2010 | 26 | 30 | 86.6667 |
| 5 | 2010 | 26 | 31 | 83.8710 |
| 6 | 2010 | 30 | 30 | 100 |
| 7 | 2010 | 26 | 31 | 83.8710 |
| 8 | 2010 | 31 | 31 | 100 |
| 9 | 2010 | 21 | 30 | 70.0000 |
| 10 | 2010 | 31 | 31 | 100.0000 |
| 11 | 2010 | 28 | 30 | 93.3333 |
| 12 | 2010 | 7 | 31 | 22.5806 |
| 1 | 2011 | 8 | 31 | 25.8065 |
| 2 | 2011 | 13 | 28 | 46.4286 |
| 4 | 2011 | 22 | 30 | 73.3333 |
| 5 | 2011 | 15 | 31 | 48.3871 |
| 6 | 2011 | 28 | 30 | 93.3333 |
| 7 | 2011 | 15 | 31 | 48.3871 |
| 8 | 2011 | 10 | 31 | 32.2581 |
| 9 | 2011 | 29 | 30 | 96.6667 |
| 10 | 2011 | 18 | 31 | 58.0645 |
| 11 | 2011 | 30 | 30 | 100 |
| 12 | 2011 | 29 | 31 | 93.5484 |
| 1 | 2012 | 21 | 31 | 67.7419 |
As you can see, I do spend a lot of time reading, but in total, I actually only spent 77.6199% of the available days actively reading a book. Or, to put it another way, of the 1126 total days available to read, I was not reading during 252 of them. Now it suddenly seems like I could be spending even more time reading than I already am. In fact, that’s like 8.2895 months of additional reading!
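For anyone curious how a table like this could be built, here is a minimal sketch in Python. It is not the actual code behind the page. The function names and the two sample reads are hypothetical, and it assumes both the start and stop dates count as reading days. It counts covered days directly rather than subtracting idle days, which gives the same result.

```python
from calendar import monthrange
from collections import Counter
from datetime import date, timedelta


def days_reading_per_month(reads):
    """Count, per (year, month), the distinct days covered by any read.

    `reads` is a list of (start, stop) date pairs, both ends inclusive
    (an assumption; the post does not say how the end dates are treated).
    """
    covered = set()
    for start, stop in reads:
        day = start
        while day <= stop:
            covered.add(day)  # a set, so a shared hand-off day counts once
            day += timedelta(days=1)
    return Counter((d.year, d.month) for d in covered)


def monthly_table(reads):
    """Yield (month, year, days_reading, days_in_month, percent) rows."""
    counts = days_reading_per_month(reads)
    for (year, month), reading in sorted(counts.items()):
        in_month = monthrange(year, month)[1]
        yield month, year, reading, in_month, round(100 * reading / in_month, 4)


# Hypothetical reads, not taken from the real database.
reads = [
    (date(2008, 12, 12), date(2009, 1, 10)),
    (date(2009, 1, 10), date(2009, 1, 31)),
]
for row in monthly_table(reads):
    print(row)

# Totals quoted in the post: 1126 available days, 252 of them idle.
total_days, idle_days = 1126, 252
print(round(100 * (total_days - idle_days) / total_days, 4))  # 77.6199
# Assumption: an average month of 30.4 days, which reproduces the 8.2895 figure.
print(round(idle_days / 30.4, 4))  # 8.2895
```

One side effect of counting this way is that a month with no reading days produces no row at all in this sketch.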
Well, I hope you enjoyed that little bit of nerdiness. I’d post more, but my head feels like it’s going to explode. Still, I’m always open to suggestions on additional stats, so let me know if you have any.
Chart: Measuring Genre Transitions
On my last chart post, I asked if there were any suggestions for other data graphics and Shawn mentioned the idea of showing the transitions between genres. In other words, do I typically read Fantasy or Sci-Fi after Juvenile Fiction?
In order to present this graphically, I thought a step line graph might work best. You will notice that each “step” is a single instance of a book being read. There are scenarios where two books of the same genre are read in a row, but the first Sci-Fi instance should give you an idea of the size of a single step.
I also ordered it such that genres were near those that are most similar. Thus, more wildly different genres would be further apart, and would be represented by longer steps in the chart.
As you can see, no obvious patterns emerge, but that in and of itself is interesting. It appears that I am pretty varied in my choice of genres and the order in which I read them. There are some interesting things, though, such as the fact that I always read Fiction after Nonfiction, and seem to enjoy going from something weird (sci-fi or fantasy) to something normal (fiction/nonfiction).
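The data behind a step chart like this can be sketched in a few lines of Python. Everything specific here is an assumption for illustration: the genre ordering, the function names, and the sample reading sequence are all made up, not pulled from the database.

```python
from collections import Counter

# Assumed ordering: neighbouring genres are meant to be the most similar.
GENRE_ORDER = ["Nonfiction", "Fiction", "Juvenile Fiction", "Fantasy", "Sci-Fi"]
POSITION = {genre: i for i, genre in enumerate(GENRE_ORDER)}


def genre_steps(genres_read):
    """Return (from, to, step) for each consecutive pair of books.

    The step is the signed distance between the two genres on the axis,
    so a longer step means a jump between less similar genres.
    """
    return [
        (a, b, POSITION[b] - POSITION[a])
        for a, b in zip(genres_read, genres_read[1:])
    ]


def transition_counts(genres_read):
    """Count how often each (from, to) genre pair occurs."""
    return Counter(zip(genres_read, genres_read[1:]))


# Hypothetical reading order, oldest first.
sequence = ["Sci-Fi", "Sci-Fi", "Nonfiction", "Fiction", "Fantasy"]
for step in genre_steps(sequence):
    print(step)
print(transition_counts(sequence)[("Nonfiction", "Fiction")])
```

Counting the pairs is a quick way to check a claim like "Fiction always follows Nonfiction" without squinting at the chart.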
However, it should be clear that there are some flaws in presenting the data this way. For example, this does not consider the degrees to which a book could deal with weird subject matter in fiction, or relatively normal subject matter in sci-fi or fantasy. This is especially problematic with a genre like juvenile fiction which can, in terms of content, be pretty much any genre.
Fortunately, I have been attempting to quantify certain aspects of books to make comparisons easier and more objective. The two categories that are most relevant to this discussion are realism (which measures the extent to which the book deviates from accepted scientific facts) and world (the extent to which the world is unlike the real world in which we live). Both categories are out of 10, with 10 being the most unrealistic and the world most unlike our own. Thus, by combining these two numbers we get what I am calling the geek quotient. This allows us to consider more fully the differences between books, regardless of genre. So, for example, both a nonfiction book, and a fiction book that is entirely realistic and set in our world, will register a geek quotient of 0. This is reasonable since the transition between such books is relatively natural.
Again, there are no obvious patterns (unless you see some I don’t?) but this graph is far more accurate than the last one. We see more zeroes, since both fiction and nonfiction can fill that role, and some more extreme transitions as we move from 0 to the high teens and back to 0. It is also interesting to see how much time is spent in the middle and bottom of the geek quotient, despite my geeky tendencies. In fact, if you take the time to observe this chart upside down (or note the “negative space” above the line) I pretty evenly split my time between geeky and non-geeky reading.
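Here is a rough sketch of the geek quotient in Python. It assumes that "combining" the two scores means adding them, which gives a 0 to 20 scale and fits the "high teens" seen in the chart. The `Book` class, the helper names, and the sample books are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Book:
    title: str
    realism: int  # 0-10, where 10 is the most unrealistic
    world: int    # 0-10, where 10 is the world most unlike our own

    @property
    def geek_quotient(self) -> int:
        # Assumption: the two scores are simply added (range 0-20).
        return self.realism + self.world


def geek_transitions(books):
    """Signed change in geek quotient from each book to the next."""
    quotients = [b.geek_quotient for b in books]
    return [later - earlier for earlier, later in zip(quotients, quotients[1:])]


def share_below_line(books, scale_max=20):
    """Fraction of the chart area that sits under the line."""
    return sum(b.geek_quotient for b in books) / (scale_max * len(books))


# Hypothetical books in reading order.
books = [
    Book("Nonfiction title", realism=0, world=0),
    Book("Space opera", realism=9, world=8),
    Book("Realistic novel", realism=0, world=0),
    Book("Urban fantasy", realism=5, world=5),
]
print(geek_transitions(books))
print(share_below_line(books))
```

`share_below_line` is one way to put a number on the "negative space" observation: a value near 0.5 would mean the area under the line and the area above it are about equal.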
So do you think the second or first chart is better? Are there any patterns that I’m overlooking? Any suggestions for improving the information these graphics reveal?
Chart: Pages per Day by Genre
One of the advantages of having a book database is that you can easily query statistical data in order to look at reading habits in a different way. It also allows for precision that would be extremely tedious otherwise.
After reading The Visual Display of Quantitative Information I realized that my existing selection of charts is, while not worthless, not as informative or useful as I would like it to be. Many of the charts have chart junk, they rely too heavily on pie charts, they are not focused on data-ink maximization, and in some cases the information could be just as easily displayed and understood in a table. (I still think the charts have worth, so I want to keep them up, but I am also open to any suggestions for improvement.)
So, with that realization, I decided to try to make something that would be more interesting, offer more insight, and conform to the rules of building good data graphics that Visual Display lays out for us. Below is my first attempt. It shows, by genre, the average number of pages read per day for each entry in the database. Yellow diamonds are books I do not recommend, while blue squares are books that I do. The objects are slightly transparent to give a sense of density for when multiple books have a similar pages per day average.
Simple, yes, but I think there are some interesting things to draw from it. First of all, different genres lend themselves naturally to being read at different speeds (which is why this is broken out by genre). Juvenile fiction you can fly through compared to everything else on the list (except for the one comedy book I’ve read, which may be an outlier, but it’s hard to say without reading other books in the genre). Also, though it is possible to slowly read a book that you are enjoying (and would recommend to others), it makes sense that we see ‘not recommended’ volumes falling near the bottom of the average pages per day for their respective genres.
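The numbers behind the chart are simple enough to sketch in Python. This is only an illustration under stated assumptions: the field names and the three sample books are hypothetical, and both the start and stop dates are counted as reading days.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def pages_per_day(pages, start, stop):
    """Average pages per day, counting both the start and stop day."""
    days = (stop - start).days + 1
    return pages / days


def rates_by_genre(books):
    """Group each book's pages-per-day rate by genre and recommendation."""
    groups = defaultdict(lambda: {True: [], False: []})
    for book in books:
        rate = pages_per_day(book["pages"], book["start"], book["stop"])
        groups[book["genre"]][book["recommended"]].append(rate)
    return groups


# Hypothetical entries, not taken from the real database.
books = [
    {"genre": "Fantasy", "pages": 300, "recommended": True,
     "start": date(2010, 1, 1), "stop": date(2010, 1, 10)},
    {"genre": "Fantasy", "pages": 400, "recommended": False,
     "start": date(2010, 2, 1), "stop": date(2010, 2, 20)},
    {"genre": "Juvenile Fiction", "pages": 200, "recommended": True,
     "start": date(2010, 3, 1), "stop": date(2010, 3, 4)},
]
for genre, rates in rates_by_genre(books).items():
    for recommended, values in rates.items():
        if values:
            label = "recommended" if recommended else "not recommended"
            print(genre, label, round(mean(values), 1))
```

Keeping the recommended and not recommended rates apart within each genre makes it easy to compare the two groups directly, which is the comparison the chart invites.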
Of course, since this is my database and my books, I could skew this to confirm or deny the theory that you read books you enjoy faster, but what’s the point? Why would I want to read a book I hate any faster than I already am? Or stop reading a book I love just to lower the pages per day? My reading time is limited as is, and I doubt I’m going to read more/less quickly just to alter my stats. Plus, this chart updates automatically with each new entry, so as the sample size increases, any outliers should become more obvious and easier to dismiss.
Anyone have any other ideas for what would make interesting graphical analysis when it comes to trends or tendencies when reading books?