Case Study: Trends from Amazon Bestsellers¶

Introduction¶

The following is a case study investigating bestselling books on Amazon between 2009 and 2019. We will look at how authors and publishers can use the trends in this data to set long-term goals.

Contents¶

  • Summary
    • Tools and Techniques
    • Recommendations
  • Guiding Questions
  • Prepare Data
  • Process Data
  • Analysis
  • References
  • Contact

Summary¶

It's no secret that Amazon has become a behemoth of an organization, but it started out selling books. By the end of the time period covered by this data set (2019), Amazon held a 50% share of all book distribution and an eBook market share of around 67% (source). For writers and publishers around the world, Amazon's success is hard to ignore, and it seems unwise not to consider publishing on the platform.

This case study will investigate the Top 50 bestselling books and their authors over the ten-year period from 2009 to 2019. The goal is to provide insights so writers and publishers can make data-informed decisions about long-term goals.

Guiding Questions¶

For this case study, our main guiding question is:

How can trends from the Top 50 Amazon Bestsellers over a 10-year period inform long-term goals for authors and publishers?

To address this question, we will explore the following sub-questions:

  • What author(s) published the most between 2009-2019?
  • What percentage of books were fiction vs. non-fiction?
  • What book appears the most between 2009-2019?

Tools and Techniques¶

This analysis made use of the following tools and techniques:

  • Python programming language and libraries: pandas, seaborn, numpy, and matplotlib
  • Data transformations: extraction, visualizations, summary statistics
  • Data inspection: removal of duplicate/unnecessary data, change of format/datatype, verification of unique values

Recommendations¶

The analysis yielded the following key observations:

  • Most authors (75%) appeared on the list two or fewer times during this 10-year period
  • A majority (73%) of the expensive books (books priced above 16 USD) were non-fiction
  • Most books (75%) on the list during this 10-year period were priced at 16 USD or less, and 50% of them were priced between 7 USD and 16 USD

These observations led to the recommendations outlined below.

Focus on high quality content¶

Most authors (75%) appeared on the list two or fewer times during this 10-year period. It takes a lot to publish a book: drafting, writing, editing, and revising. This would seem to indicate that, for those authors who made it onto the Top 50 bestsellers list, their book was most likely well received because it was well written.

For writers and publishers seeking to make it onto these lists in the next ten years, this means focusing on writing less and writing better. Slowing down to focus on higher-quality content increases the chance of your book being well received and popular.

For higher priced books, focus on non-fiction¶

A majority (73%) of the expensive books (books priced above 16 USD) were non-fiction. Since non-fiction books generally require more research and effort to write, it makes sense that they would be priced higher to reflect that. Authors and publishers seeking to earn more might consider shifting their output to non-fiction books. This could be more lucrative if combined with other sources of income such as speaking engagements, workshops, and courses.

Set price between 7 and 16 USD¶

This is likely to change due to inflation and increased production costs, but it should be noted that most books (75%) on the list during this 10-year period were priced at 16 USD or less and 50% of them were priced between 7 USD and 16 USD. In other words, when choosing a book's price, ensure that it is within the current range of book prices: not too high, but not too low.

Prepare Data¶

Dataset¶

The data we'll be working with is the Amazon Top 50 Bestselling Books 2009 - 2019 dataset provided by Sooter Saalu on Kaggle.

Description¶

The dataset contains Amazon's Top 50 bestselling books from 2009 to 2019. There are 550 entries, each categorized as fiction or non-fiction using Goodreads.

The data contains the following columns:

  • Name: Name of the book (i.e., book title)
  • Author: Name of the person/organization who wrote/published the book
  • User Rating: Average user rating on a scale of 1 to 5 in a given year
  • Reviews: The total number of reviews the book received in a given year
  • Price: The list price of the book in a given year
  • Year: The year the book appeared on the bestseller list
  • Genre: Whether the book is fiction or non-fiction

License¶

The data is made available to use via the CC0: Public Domain license which allows anyone to "copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission".

Process Data¶

Let's import our libraries and start processing our data. We will run the following checks:

  • Check for missing values
  • Investigate outliers
  • Confirm data types
In [ ]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

Quick overview¶

In [ ]:
df = pd.read_csv('amazon_bestsellers_2009-2019.csv')
In [ ]:
df.shape
Out[ ]:
(550, 7)
In [ ]:
df.head()
Out[ ]:
   Name                                               Author                    User Rating  Reviews  Price  Year  Genre
0  10-Day Green Smoothie Cleanse                      JJ Smith                  4.7          17350    8      2016  Non Fiction
1  11/22/63: A Novel                                  Stephen King              4.6          2052     22     2011  Fiction
2  12 Rules for Life: An Antidote to Chaos            Jordan B. Peterson        4.7          18979    15     2018  Non Fiction
3  1984 (Signet Classics)                             George Orwell             4.7          21424    6      2017  Fiction
4  5,000 Awesome Facts (About Everything!) (Natio...  National Geographic Kids  4.8          7665     12     2019  Non Fiction
In [ ]:
df.describe()
Out[ ]:
       User Rating       Reviews       Price         Year
count   550.000000    550.000000  550.000000   550.000000
mean      4.618364  11953.281818   13.100000  2014.000000
std       0.226980  11731.132017   10.842262     3.165156
min       3.300000     37.000000    0.000000  2009.000000
25%       4.500000   4058.000000    7.000000  2011.000000
50%       4.700000   8580.000000   11.000000  2014.000000
75%       4.800000  17253.250000   16.000000  2017.000000
max       4.900000  87841.000000  105.000000  2019.000000

Check for missing values¶

In [ ]:
df.isnull().sum()
Out[ ]:
Name           0
Author         0
User Rating    0
Reviews        0
Price          0
Year           0
Genre          0
dtype: int64

Observations

No missing values found.

Investigate outliers¶

In [ ]:
columns_to_plot = ['User Rating', 'Reviews', 'Price', 'Year']
for column_name in columns_to_plot:
    plt.boxplot(df[column_name])
    plt.title('Box Plot: ' + column_name)
    plt.xlabel(column_name)
    plt.ylabel('Value')
    plt.show()
[Figures: box plots of User Rating, Reviews, Price, and Year]

Observations

Price, User Rating, and Reviews have outliers. However, this is to be expected, since book prices, ratings, and review counts naturally vary. Further, the outliers do not seem unreasonable given the context of the data.

Confirm data types¶

In [ ]:
df.dtypes
Out[ ]:
Name            object
Author          object
User Rating    float64
Reviews          int64
Price            int64
Year             int64
Genre           object
dtype: object

Observations

No abnormal data types.

Process Conclusions¶

  • No missing values; contains 550 entries (as described in the source)
  • The min and max of the year match the data description
  • The data types for each column match what is described (i.e., numbers are numbers, categorical data are objects/strings)
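
The Tools and Techniques list mentions duplicate removal and verifying unique values, which the cells above do not show. A minimal sketch of how such checks might look, using a small made-up frame (the rows and the name `sample` are illustrative assumptions, not taken from the dataset):

```python
import pandas as pd

# Hypothetical sample rows; the notebook would run these checks on `df`.
sample = pd.DataFrame({
    "Name": ["Book A", "Book A", "Book B", "Book B"],
    "Year": [2012, 2013, 2014, 2014],
    "Genre": ["Fiction", "Fiction", "Non Fiction", "Non Fiction"],
})

# Fully identical rows (same title in the same year) are true duplicates.
num_duplicate_rows = int(sample.duplicated().sum())

# A title repeating across different years is expected and is kept.
deduplicated = sample.drop_duplicates()

# Categorical columns should contain only the expected labels.
genre_labels = sorted(sample["Genre"].unique())

print(num_duplicate_rows, len(deduplicated), genre_labels)
```

Only rows that match in every column are treated as duplicates here, so a book that stays on the list for several years is not dropped.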

Analysis¶

Correlation Heatmap¶

In [ ]:
correlation_matrix = df[['User Rating', 'Price', 'Reviews', 'Year']].corr()

plt.figure(figsize=(10,8))

sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm')

plt.title('Correlation Heatmap')

plt.show()
[Figure: Correlation Heatmap]

Observations

Little or no correlation between User Rating, Price, and Reviews.
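
For readers unfamiliar with what the heatmap values mean: `DataFrame.corr()` computes the Pearson correlation coefficient by default. The sketch below recomputes it by hand on made-up numbers (the `toy` frame and the `pearson` helper are assumptions for illustration) and compares the result with pandas:

```python
import numpy as np
import pandas as pd

# Made-up values for illustration only.
toy = pd.DataFrame({
    "Price": [5.0, 8.0, 12.0, 20.0, 30.0],
    "Reviews": [900.0, 400.0, 1500.0, 700.0, 1100.0],
})

def pearson(x, y):
    """Covariance of x and y divided by the product of their spreads."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    dy = y - y.mean()
    return float((dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum()))

r_manual = pearson(toy["Price"], toy["Reviews"])
r_pandas = float(toy.corr().loc["Price", "Reviews"])  # Pearson is the default method

print(round(r_manual, 3), round(r_pandas, 3))
```

Values near 0 indicate little linear relationship, while values near 1 or -1 indicate a strong one.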

Genre¶

In [ ]:
genre = df["Genre"].value_counts()
print(genre)

percent_non_fic = round(genre['Non Fiction'] / len(df) * 100)
percent_fic = round(genre['Fiction'] / len(df) * 100)

sizes = [percent_non_fic, percent_fic]
labels = ['Non Fiction', 'Fiction']

plt.pie(sizes, labels=labels, autopct='%1.1f%%', startangle=140)
plt.axis('equal')
plt.title('Genre')
plt.show()
Non Fiction    310
Fiction        240
Name: Genre, dtype: int64
[Figure: pie chart, Genre]
In [ ]:
sns.countplot(x='Genre', data=df)
 
plt.xticks(rotation=-45)

plt.show()
[Figure: count plot of Genre]

Observations

  • Most books are non-fiction, but only by a slight majority
    • Non Fiction: 56%
    • Fiction: 44%
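
The percentages above can also be obtained without dividing by a row count by hand. A minimal sketch, assuming only the two genre counts printed earlier (310 and 240); the `genre_column` series is rebuilt here for illustration rather than read from the CSV:

```python
import pandas as pd

# Rebuild only the Genre column from the counts printed above.
genre_column = pd.Series(["Non Fiction"] * 310 + ["Fiction"] * 240, name="Genre")

# normalize=True returns shares that sum to 1, so no row count is hard-coded.
genre_share = genre_column.value_counts(normalize=True) * 100

percent_non_fic = int(round(genre_share["Non Fiction"]))
percent_fic = int(round(genre_share["Fiction"]))

print(percent_non_fic, percent_fic)
```

In the notebook itself the same call would be `df["Genre"].value_counts(normalize=True)`.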

Price¶

Summary Statistics¶

In [ ]:
df["Price"].describe()
Out[ ]:
count    550.000000
mean      13.100000
std       10.842262
min        0.000000
25%        7.000000
50%       11.000000
75%       16.000000
max      105.000000
Name: Price, dtype: float64

Box plot¶

In [ ]:
plt.boxplot(df['Price'])
plt.title('Box Plot of Price Values')
plt.xlabel('Price')
plt.ylabel('Value')
plt.yticks(range(0, 110, 5))
plt.show()
[Figure: Box Plot of Price Values]

Histogram¶

In [ ]:
plt.figure(figsize=(8,6), dpi=80)
sns.histplot(data=df["Price"], bins='auto')

plt.title('Amazon Bestsellers 2009-2019: Price')
plt.xlabel('Price')
plt.ylabel('Frequency')

plt.show()
[Figure: histogram, Amazon Bestsellers 2009-2019: Price]

Observations

  • 50% of the price values for books are between 7 USD and 16 USD.
  • Outliers are greater than 30 USD
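
The outlier cutoff can be checked from the quartiles alone. The sketch below assumes the box plot uses matplotlib's default whisker rule (1.5 times the interquartile range); `tukey_fences` is a hypothetical helper name introduced for illustration:

```python
def tukey_fences(q1, q3, k=1.5):
    """Return the (lower, upper) limits beyond which a box plot flags outliers."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Quartiles of Price taken from the summary statistics above.
price_lower, price_upper = tukey_fences(q1=7.0, q3=16.0)

print(price_lower, price_upper)
```

With Q1 = 7 and Q3 = 16 the upper limit is 29.5 USD, which is consistent with the roughly 30 USD cutoff noted above; the lower limit is negative, so no low-priced book is flagged.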

Expensive Books¶

For our purposes we will define a book as expensive if it costs more than 16 USD, since 75% of books in our data set cost 16 USD or less.

In [ ]:
expensive_books_sorted = df.loc[df['Price']>16].sort_values(by='Price', ascending=False)

num_expensive_books = len(expensive_books_sorted)

print(num_expensive_books, 'books sold for more than 16 USD')
expensive_books_sorted
122 books sold for more than 16 USD
Out[ ]:
     Name                                               Author                            User Rating  Reviews  Price  Year  Genre
69   Diagnostic and Statistical Manual of Mental Di...  American Psychiatric Association  4.5          6679     105    2013  Non Fiction
70   Diagnostic and Statistical Manual of Mental Di...  American Psychiatric Association  4.5          6679     105    2014  Non Fiction
473  The Twilight Saga Collection                       Stephenie Meyer                   4.7          3801     82     2009  Fiction
151  Hamilton: The Revolution                           Lin-Manuel Miranda                4.9          5867     54     2016  Non Fiction
346  The Book of Basketball: The NBA According to T...  Bill Simmons                      4.7          858      53     2009  Non Fiction
...  ...                                                ...                               ...          ...      ...    ...   ...
311  StrengthsFinder 2.0                                Gallup                            4.0          5069     17     2016  Non Fiction
446  The Pioneer Woman Cooks: A Year of Holidays: 1...  Ree Drummond                      4.8          2663     17     2013  Non Fiction
312  StrengthsFinder 2.0                                Gallup                            4.0          5069     17     2017  Non Fiction
342  The Big Short: Inside the Doomsday Machine         Michael Lewis                     4.7          3536     17     2010  Non Fiction
307  StrengthsFinder 2.0                                Gallup                            4.0          5069     17     2012  Non Fiction

122 rows × 7 columns
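
The 16 USD cutoff used above is the 75th percentile of Price, so it could also be derived from the data rather than typed in. A minimal sketch on made-up rows (the `toy` frame and its values are assumptions for illustration; the notebook would use `df`):

```python
import pandas as pd

# Made-up rows for illustration only.
toy = pd.DataFrame({
    "Price": [0, 4, 6, 7, 8, 11, 13, 16, 22, 54],
    "Genre": ["Fiction", "Fiction", "Non Fiction", "Fiction", "Fiction",
              "Non Fiction", "Fiction", "Fiction", "Non Fiction", "Non Fiction"],
})

# Derive the cutoff from the data instead of hard-coding it.
upper_cutoff = toy["Price"].quantile(0.75)

# Strictly above the 75th percentile, mirroring the `> 16` filter above.
expensive = toy[toy["Price"] > upper_cutoff]

# Share of each genre within the expensive subset, as percentages.
expensive_genre_share = expensive["Genre"].value_counts(normalize=True) * 100

print(upper_cutoff, len(expensive))
print(expensive_genre_share.round(1))
```

Deriving the threshold this way keeps the definition of "expensive" in step with the data if the CSV is ever updated.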

Summary Statistics¶
In [ ]:
expensive_books_sorted['Price'].describe()
Out[ ]:
count    122.000000
mean      27.172131
std       14.925541
min       17.000000
25%       18.000000
50%       21.000000
75%       29.500000
max      105.000000
Name: Price, dtype: float64
Box plot¶
In [ ]:
plt.boxplot(expensive_books_sorted['Price'])
plt.title('Expensive Books (>$16)')
plt.xlabel('Price')
plt.ylabel('Value')

plt.show()
[Figure: box plot, Expensive Books (>$16)]
Pie chart¶
In [ ]:
expensive_books_genre = expensive_books_sorted['Genre'].value_counts()
percent_non_fic_expensive = round((expensive_books_genre['Non Fiction'] / num_expensive_books) * 100)
percent_fic_expensive = round((expensive_books_genre['Fiction'] / num_expensive_books) * 100)

print(expensive_books_genre)

# pie chart of genre for expensive books
genre_percents_expensive = [percent_non_fic_expensive, percent_fic_expensive]
genre_labels_expensive = ['Non Fiction', 'Fiction']

plt.pie(genre_percents_expensive, labels=genre_labels_expensive, autopct='%1.0f%%', startangle=140)
plt.axis('equal')
plt.title('Expensive Books: Genre')

plt.show()
Non Fiction    89
Fiction        33
Name: Genre, dtype: int64
[Figure: pie chart, Expensive Books: Genre]
Highest priced book¶
In [ ]:
df.loc[df["Price"]==105]
Out[ ]:
    Name                                               Author                            User Rating  Reviews  Price  Year  Genre
69  Diagnostic and Statistical Manual of Mental Di...  American Psychiatric Association  4.5          6679     105    2013  Non Fiction
70  Diagnostic and Statistical Manual of Mental Di...  American Psychiatric Association  4.5          6679     105    2014  Non Fiction
Observations¶
  • The two highest priced titles (Diagnostic and Statistical Manual... at 105 USD, Twilight Saga Collection at 82 USD) cost over 50 USD more than the 75th percentile (29.50 USD) of the expensive books
  • A majority (73%) of expensive books are non fiction

Inexpensive Books¶

For our purposes we will define inexpensive books as those priced at 7 USD or less, since 7 USD is the 25th percentile of all book prices.

In [ ]:
inexpensive_books_sorted = df.loc[df["Price"]<=7].sort_values(by='Price', ascending=False)
num_inexpensive_books = len(inexpensive_books_sorted)

print(num_inexpensive_books, "books sold for 7 USD or less", )
inexpensive_books_sorted
148 books sold for 7 USD or less
Out[ ]:
     Name                                               Author                           User Rating  Reviews  Price  Year  Genre
147  Goodnight, Goodnight Construction Site (Hardco...  Sherri Duskey Rinker             4.9          7038     7      2013  Fiction
399  The Handmaid's Tale                                Margaret Atwood                  4.3          29442    7      2017  Fiction
367  The Fault in Our Stars                             John Green                       4.7          50482    7      2014  Fiction
146  Goodnight, Goodnight Construction Site (Hardco...  Sherri Duskey Rinker             4.9          7038     7      2012  Fiction
283  Quiet: The Power of Introverts in a World That...  Susan Cain                       4.6          10009    7      2013  Non Fiction
...  ...                                                ...                              ...          ...      ...    ...   ...
381  The Getaway                                        Jeff Kinney                      4.8          5836     0      2017  Fiction
116  Frozen (Little Golden Book)                        RH Disney                        4.7          3642     0      2014  Fiction
42   Cabin Fever (Diary of a Wimpy Kid, Book 6)         Jeff Kinney                      4.8          4505     0      2011  Fiction
358  The Constitution of the United States              Delegates of the Constitutional  4.8          2774     0      2016  Non Fiction
71   Diary of a Wimpy Kid: Hard Luck, Book 8            Jeff Kinney                      4.8          6812     0      2013  Fiction

148 rows × 7 columns

Summary Statistics¶
In [ ]:
inexpensive_books_sorted['Price'].describe()
Out[ ]:
count    148.000000
mean       4.804054
std        1.883183
min        0.000000
25%        4.000000
50%        5.000000
75%        6.000000
max        7.000000
Name: Price, dtype: float64
Box plot¶
In [ ]:
plt.boxplot(inexpensive_books_sorted['Price'])
plt.title('Inexpensive Books (<=$7)')
plt.xlabel('Price')
plt.ylabel('Value')

plt.show()
[Figure: box plot, Inexpensive Books (<=$7)]
Pie chart¶
In [ ]:
inexpensive_books_genre = inexpensive_books_sorted['Genre'].value_counts()

percent_non_fic_inexpensive = round((inexpensive_books_genre['Non Fiction'] / num_inexpensive_books) * 100)
percent_fic_inexpensive = round((inexpensive_books_genre['Fiction'] / num_inexpensive_books) * 100)

print(inexpensive_books_genre)

# pie chart of genre for inexpensive books
genre_percents_inexpensive = [percent_non_fic_inexpensive, percent_fic_inexpensive]
genre_labels_inexpensive = ['Non Fiction', 'Fiction']

plt.pie(genre_percents_inexpensive, labels=genre_labels_inexpensive, autopct='%1.0f%%', startangle=140)
plt.axis('equal')
plt.title('Inexpensive Books: Genre')

plt.show()
Fiction        83
Non Fiction    65
Name: Genre, dtype: int64
[Figure: pie chart, Inexpensive Books: Genre]
Free books¶
In [ ]:
print(len(df.loc[df["Price"]==0]), "books sold for free")
df.loc[df["Price"]==0]
12 books sold for free
Out[ ]:
     Name                                               Author                           User Rating  Reviews  Price  Year  Genre
42   Cabin Fever (Diary of a Wimpy Kid, Book 6)         Jeff Kinney                      4.8          4505     0      2011  Fiction
71   Diary of a Wimpy Kid: Hard Luck, Book 8            Jeff Kinney                      4.8          6812     0      2013  Fiction
116  Frozen (Little Golden Book)                        RH Disney                        4.7          3642     0      2014  Fiction
193  JOURNEY TO THE ICE P                               RH Disney                        4.6          978      0      2014  Fiction
219  Little Blue Truck                                  Alice Schertle                   4.9          1884     0      2014  Fiction
358  The Constitution of the United States              Delegates of the Constitutional  4.8          2774     0      2016  Non Fiction
381  The Getaway                                        Jeff Kinney                      4.8          5836     0      2017  Fiction
461  The Short Second Life of Bree Tanner: An Eclip...  Stephenie Meyer                  4.6          2122     0      2010  Fiction
505  To Kill a Mockingbird                              Harper Lee                       4.8          26234    0      2013  Fiction
506  To Kill a Mockingbird                              Harper Lee                       4.8          26234    0      2014  Fiction
507  To Kill a Mockingbird                              Harper Lee                       4.8          26234    0      2015  Fiction
508  To Kill a Mockingbird                              Harper Lee                       4.8          26234    0      2016  Fiction
In [ ]:
print(len(df.loc[df["Price"]==1]), "book sold for 1 USD")
df.loc[df["Price"]==1]
1 book sold for 1 USD
Out[ ]:
    Name                                               Author           User Rating  Reviews  Price  Year  Genre
91  Eat This Not That! Supermarket Survival Guide:...  David Zinczenko  4.5          720      1      2009  Non Fiction
Observations¶
  • Some books appear multiple years in a row (e.g., 'To Kill a Mockingbird')
  • A majority (75%) of inexpensive books are priced between 4 USD and 7 USD
  • A slight majority (56%) of inexpensive books are fiction

User Rating¶

Bar graph¶

In [ ]:
plt.figure(figsize=(10, 6), dpi=80)
sns.countplot(x='User Rating', data=df)

plt.show()
[Figure: count plot of User Rating]

Summary Statistics¶

In [ ]:
df['User Rating'].describe()
Out[ ]:
count    550.000000
mean       4.618364
std        0.226980
min        3.300000
25%        4.500000
50%        4.700000
75%        4.800000
max        4.900000
Name: User Rating, dtype: float64

Box plot¶

In [ ]:
plt.boxplot(df['User Rating'])
plt.title('Box Plot of User Rating')
plt.xlabel('User Rating')
plt.ylabel('Value')
plt.show()
[Figure: Box Plot of User Rating]

Highest Rated Books¶

For our purposes, we will define the highest rated books as any books that have a 'User Rating' of 4.9, the maximum in the data set, since most books (75%) are rated 4.8 or lower.

In [ ]:
highest_rated_books = df.loc[df['User Rating']==4.9]
list_of_highest_rated_books = highest_rated_books['Name'].value_counts().index.tolist()

for book in list_of_highest_rated_books:
    print(book)
Oh, the Places You'll Go!
The Very Hungry Caterpillar
Jesus Calling: Enjoying Peace in His Presence (with Scripture References)
The Wonderful Things You Will Be
Goodnight, Goodnight Construction Site (Hardcover Books for Toddlers, Preschool Books for Kids)
Brown Bear, Brown Bear, What Do You See?
Dog Man: Brawl of the Wild: From the Creator of Captain Underpants (Dog Man #6)
Dog Man: Lord of the Fleas: From the Creator of Captain Underpants (Dog Man #5)
Dog Man: For Whom the Ball Rolls: From the Creator of Captain Underpants (Dog Man #7)
Unfreedom of the Press
Dog Man: A Tale of Two Kitties: From the Creator of Captain Underpants (Dog Man #3)
The Magnolia Story
The Legend of Zelda: Hyrule Historia
Strange Planet (Strange Planet Series)
Rush Revere and the First Patriots: Time-Travel Adventures With Exceptional Americans (2)
Rush Revere and the Brave Pilgrims: Time-Travel Adventures with Exceptional Americans (1)
Dog Man: Fetch-22: From the Creator of Captain Underpants (Dog Man #8)
Obama: An Intimate Portrait
Little Blue Truck
Last Week Tonight with John Oliver Presents A Day in the Life of Marlon Bundo (Better Bundo Book, LGBT Children’s Book)
Dog Man and Cat Kid: From the Creator of Captain Underpants (Dog Man #4)
Humans of New York : Stories
Harry Potter and the Sorcerer's Stone: The Illustrated Edition (Harry Potter, Book 1)
Harry Potter and the Prisoner of Azkaban: The Illustrated Edition (Harry Potter, Book 3)
Harry Potter and the Goblet of Fire: The Illustrated Edition (Harry Potter, Book 4) (4)
Harry Potter and the Chamber of Secrets: The Illustrated Edition (Harry Potter, Book 2)
Hamilton: The Revolution
Wrecking Ball (Diary of a Wimpy Kid Book 14)

Observations

  • 50% of user ratings are between 4.5 and 4.8
  • Outliers are less than 4.1

Reviews¶

In [ ]:
# create series for value counts
review_counts = df['Reviews'].value_counts()

# convert series to data frame
review_counts_df = pd.DataFrame({'num_reviews': review_counts.index, 'num_books': review_counts.values})

# create histogram of reviews
plt.figure(figsize=(10,6), dpi=80)
sns.histplot(data=review_counts_df['num_reviews'], bins='auto', binwidth=2000)
plt.title('Reviews')
plt.xlabel('Amount of Reviews')
plt.ylabel('Number of Books')

plt.show()
[Figure: histogram, Reviews]
In [ ]:
review_counts_df['num_reviews'].describe()
Out[ ]:
count      346.000000
mean      9786.430636
std      10871.900146
min         37.000000
25%       3362.750000
50%       6361.500000
75%      11510.250000
max      87841.000000
Name: num_reviews, dtype: float64

Books with Most Number of Reviews¶

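Since 75% of the distinct review counts fall below 11,510 (a rough proxy for distinct titles, because a title generally carries the same review count in each year it appears), we will define highly reviewed books as books that have more than 11,510 reviews.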

In [ ]:
highest_reviews = df.loc[df['Reviews']>11510]
list_of_highest_reviews = highest_reviews['Name'].value_counts().index.tolist()

print(len(list_of_highest_reviews), 'books have more than 11,510 reviews')

for book in list_of_highest_reviews:
    print(book)
88 books have more than 11,510 reviews
Oh, the Places You'll Go!
The Very Hungry Caterpillar
The Four Agreements: A Practical Guide to Personal Freedom (A Toltec Wisdom Book)
Jesus Calling: Enjoying Peace in His Presence (with Scripture References)
First 100 Words
Wonder
Unbroken: A World War II Story of Survival, Resilience, and Redemption
To Kill a Mockingbird
The 5 Love Languages: The Secret to Love that Lasts
How to Win Friends & Influence People
Giraffes Can't Dance
The Life-Changing Magic of Tidying Up: The Japanese Art of Decluttering and Organizing
The Help
The Fault in Our Stars
You Are a Badass: How to Stop Doubting Your Greatness and Start Living an Awesome Life
The Great Gatsby
Catching Fire (The Hunger Games)
Mockingjay (The Hunger Games)
The Subtle Art of Not Giving a F*ck: A Counterintuitive Approach to Living a Good Life
Player's Handbook (Dungeons & Dragons)
Gone Girl
Milk and Honey
The Book Thief
The Art of Racing in the Rain: A Novel
The Hunger Games (Book 1)
The Boys in the Boat: Nine Americans and Their Epic Quest for Gold at the 1936 Berlin Olympics
Love You Forever
The Hunger Games Trilogy Boxed Set (1)
The Complete Ketogenic Diet for Beginners: Your Essential Guide to Living the Keto Lifestyle
School Zone - Big Preschool Workbook - Ages 4 and Up, Colors, Shapes, Numbers 1-10, Alphabet, Pre-Writing, Pre-Reading…
Ready Player One: A Novel
Proof of Heaven: A Neurosurgeon's Journey into the Afterlife
All the Light We Cannot See
A Man Called Ove: A Novel
The Nightingale: A Novel
The Girl on the Train
The Goldfinch: A Novel (Pulitzer Prize for Fiction)
Becoming
The Shack: Where Tragedy Confronts Eternity
Brown Bear, Brown Bear, What Do You See?
If Animals Kissed Good Night
Hillbilly Elegy: A Memoir of a Family and Culture in Crisis
Educated: A Memoir
Heaven is for Real: A Little Boy's Astounding Story of His Trip to Heaven and Back
The Wonky Donkey
Fifty Shades of Grey: Book One of the Fifty Shades Trilogy (Fifty Shades of Grey Series)
Girl, Wash Your Face: Stop Believing the Lies About Who You Are So You Can Become Who You Were Meant to Be
Divergent
The Guardians: A Novel
The Hunger Games
1984 (Signet Classics)
The Handmaid's Tale
Wild: From Lost to Found on the Pacific Crest Trail
Where the Crawdads Sing
When Breath Becomes Air
Twilight (The Twilight Saga, Book 1)
A Dance with Dragons (A Song of Ice and Fire)
The Racketeer
A Game of Thrones / A Clash of Kings / A Storm of Swords / A Feast of Crows / A Dance with Dragons
A Gentleman in Moscow: A Novel
The Total Money Makeover: Classic Edition: A Proven Plan for Financial Fitness
The Martian
The Silent Patient
Divergent / Insurgent
Sycamore Row (Jake Brigance)
American Sniper: The Autobiography of the Most Lethal Sniper in U.S. Military History
Harry Potter Paperback Box Set (Books 1-7)
Dog Man: Fetch-22: From the Creator of Captain Underpants (Dog Man #8)
Fifty Shades Darker
Fifty Shades Freed: Book Three of the Fifty Shades Trilogy (Fifty Shades of Grey Series) (English Edition)
Fifty Shades Trilogy (Fifty Shades of Grey / Fifty Shades Darker / Fifty Shades Freed)
Fire and Fury: Inside the Trump White House
Go Set a Watchman: A Novel
Grey: Fifty Shades of Grey as Told by Christian (Fifty Shades of Grey Series)
Harry Potter and the Chamber of Secrets: The Illustrated Edition (Harry Potter, Book 2)
Harry Potter and the Cursed Child, Parts 1 & 2, Special Rehearsal Edition Script
Can't Hurt Me: Master Your Mind and Defy the Odds
And the Mountains Echoed
Inferno
Last Week Tonight with John Oliver Presents A Day in the Life of Marlon Bundo (Better Bundo Book, LGBT Children’s Book)
Little Fires Everywhere
12 Rules for Life: An Antidote to Chaos
Origin: A Novel (Robert Langdon)
Orphan Train
Doctor Sleep: A Novel
The Alchemist
The Body Keeps the Score: Brain, Mind, and Body in the Healing of Trauma
10-Day Green Smoothie Cleanse
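Top 10 Highly Reviewed Books¶
In [ ]:
highest_reviews_sorted = highest_reviews.sort_values(by='Reviews', ascending=False)
# Note: value_counts() re-orders titles by how often they appear, so this
# returns the 10 highly reviewed titles with the most list appearances
list_of_top_10_highest_reviews = highest_reviews_sorted['Name'].value_counts().head(10).index.tolist()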

for book in list_of_top_10_highest_reviews:
    print(book)
Oh, the Places You'll Go!
The Very Hungry Caterpillar
Jesus Calling: Enjoying Peace in His Presence (with Scripture References)
The Four Agreements: A Practical Guide to Personal Freedom (A Toltec Wisdom Book)
How to Win Friends & Influence People
The 5 Love Languages: The Secret to Love that Lasts
Wonder
First 100 Words
To Kill a Mockingbird
Giraffes Can't Dance
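
Because `value_counts()` orders titles by how often they appear, the list above is ranked by number of appearances rather than by review count. A minimal sketch of one way to rank by reviews instead, shown on made-up rows (the `toy` frame is an assumption for illustration; this is not the notebook's method):

```python
import pandas as pd

# Made-up rows; a title repeats once per year it was on the list.
toy = pd.DataFrame({
    "Name": ["Book A", "Book A", "Book A", "Book B", "Book C", "Book C"],
    "Reviews": [12000, 12000, 12000, 50000, 30000, 30000],
    "Year": [2015, 2016, 2017, 2019, 2018, 2019],
})

# Ordered by number of appearances (what value_counts() does).
by_appearances = toy["Name"].value_counts().index.tolist()

# Ordered by review count: sort first, then keep one row per title.
by_reviews = (
    toy.sort_values(by="Reviews", ascending=False)
       .drop_duplicates(subset="Name")["Name"]
       .tolist()
)

print(by_appearances)
print(by_reviews)
```

Applied to the notebook's `highest_reviews` frame, adding `.head(10)` before `.tolist()` would give the ten titles with the most reviews.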

Observations

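Most distinct titles (roughly 75%) have fewer than 12,000 reviews; counting every yearly appearance, 75% of entries have about 17,253 reviews or fewer.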

Author¶

Let's investigate how many times each author appears on the list between 2009 and 2019 and which authors appear the most during this period.

In [ ]:
author_counts = df["Author"].value_counts()
print(author_counts)
Jeff Kinney                           12
Gary Chapman                          11
Rick Riordan                          11
Suzanne Collins                       11
American Psychological Association    10
                                      ..
Keith Richards                         1
Chris Cleave                           1
Alice Schertle                         1
Celeste Ng                             1
Adam Gasiewski                         1
Name: Author, Length: 248, dtype: int64

Summary Statistics¶

In [ ]:
author_counts.describe()
Out[ ]:
count    248.000000
mean       2.217742
std        2.046268
min        1.000000
25%        1.000000
50%        1.000000
75%        2.000000
max       12.000000
Name: Author, dtype: float64

Observations

  • 75% of the authors on the list appeared two or fewer times over the 10-year period of 2009-2019
  • About 23% of the authors (58 of 248) appeared more than twice
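
Note that `value_counts()` on the Author column counts rows, i.e. appearances on the yearly lists, not distinct books (for example, 'To Kill a Mockingbird' fills four rows in the free-books table above). A minimal sketch of how the two measures differ, using made-up rows (the `toy` frame and its author and title names are placeholders):

```python
import pandas as pd

# Made-up rows for illustration only.
toy = pd.DataFrame({
    "Author": ["Author X", "Author X", "Author X", "Author Y", "Author Y"],
    "Name": ["Title 1", "Title 1", "Title 1", "Title 2", "Title 3"],
    "Year": [2013, 2014, 2015, 2016, 2017],
})

# Rows per author: how many times the author appears on the yearly lists.
appearances = toy["Author"].value_counts()

# Distinct titles per author: closer to "how many books were published".
distinct_titles = toy.groupby("Author")["Name"].nunique()

print(appearances)
print(distinct_titles)
```

Here Author X has three appearances from a single title, while Author Y has two appearances from two different titles.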

Box plot¶

In this box plot, we can see how many times each bestselling author appeared on the list between 2009 and 2019.

In [ ]:
plt.boxplot(author_counts)
plt.title('Publishing Frequency of Authors')
plt.xlabel('Author')
plt.ylabel('Number of Published Entries')

plt.show()
[Figure: box plot, Publishing Frequency of Authors]

Most Frequent Authors¶

For our purposes, we will define the most frequent authors as those who appeared on the list more than twice.

In [ ]:
mask_authors = author_counts>2
authors_most_frequent = author_counts[mask_authors]

print(len(authors_most_frequent), 'authors appeared on the list more than twice from 2009-2019')
print(authors_most_frequent)
58 authors appeared on the list more than twice from 2009-2019
Jeff Kinney                           12
Gary Chapman                          11
Rick Riordan                          11
Suzanne Collins                       11
American Psychological Association    10
Dr. Seuss                              9
Gallup                                 9
Rob Elliott                            8
Stephen R. Covey                       7
Stephenie Meyer                        7
Dav Pilkey                             7
Bill O'Reilly                          7
Eric Carle                             7
The College Board                      6
E L James                              6
Don Miguel Ruiz                        6
J.K. Rowling                           6
Stieg Larsson                          6
Sarah Young                            6
Harper Lee                             6
Laura Hillenbrand                      5
R. J. Palacio                          5
Dale Carnegie                          5
Patrick Lencioni                       5
Giles Andreae                          5
Roger Priddy                           5
John Green                             5
John Grisham                           5
Marie Kondō                            4
Rupi Kaur                              4
Rod Campbell                           4
Charlaine Harris                       4
Jim Collins                            4
Kathryn Stockett                       4
Emily Winfield Martin                  4
Stephen King                           4
Jen Sincero                            4
Malcolm Gladwell                       4
Thug Kitchen                           4
Veronica Roth                          4
Glenn Beck                             3
Drew Daywalt                           3
Rachel Hollis                          3
Walter Isaacson                        3
Gillian Flynn                          3
Dan Brown                              3
Rebecca Skloot                         3
Melissa Hartwig Urban                  3
Mark Manson                            3
Margaret Wise Brown                    3
George R.R. Martin                     3
F. Scott Fitzgerald                    3
Wizards RPG Team                       3
Ina Garten                             3
Brandon Stanton                        3
Ree Drummond                           3
Carol S. Dweck                         3
Francis Chan                           3
Name: Author, dtype: int64
In [ ]:
plt.figure(figsize=(8, 6), dpi=80)
sns.histplot(data=authors_most_frequent, bins='auto', binwidth=1)

plt.title('Most Frequent Authors')
plt.xlabel('Number of Times Published')
plt.ylabel('Number of Authors')

plt.show()
[Figure: histogram, Most Frequent Authors]

Top 10 Authors¶

In [ ]:
top_ten_authors = authors_most_frequent.head(10)
print(top_ten_authors)
Jeff Kinney                           12
Gary Chapman                          11
Rick Riordan                          11
Suzanne Collins                       11
American Psychological Association    10
Dr. Seuss                              9
Gallup                                 9
Rob Elliott                            8
Stephen R. Covey                       7
Stephenie Meyer                        7
Name: Author, dtype: int64
Bar plot¶
In [ ]:
# convert series to data frame
top_ten_authors_df = pd.DataFrame({'Author': top_ten_authors.index, 'Count': top_ten_authors.values})

# create bar plot of data frame
plt.figure(figsize=(10,8))
sns.barplot(data=top_ten_authors_df, x='Author', y='Count')
plt.title('Top 10 Authors')
plt.xlabel('Author')
plt.xticks(rotation=90)
plt.ylabel('Count')

plt.show()
[Figure: bar plot, Top 10 Authors]
In [ ]:
list_of_top_ten_authors = top_ten_authors.index.tolist()

df_top_10_authors = df[df['Author'].isin(list_of_top_ten_authors)]

df_top_10_authors
Out[ ]:
     Name                                           Author           User Rating  Reviews  Price  Year  Genre
38   Breaking Dawn (The Twilight Saga, Book 4)      Stephenie Meyer  4.6          9769     13     2009  Fiction
42   Cabin Fever (Diary of a Wimpy Kid, Book 6)     Jeff Kinney      4.8          4505     0      2011  Fiction
46   Catching Fire (The Hunger Games)               Suzanne Collins  4.7          22614    11     2010  Fiction
47   Catching Fire (The Hunger Games)               Suzanne Collins  4.7          22614    11     2011  Fiction
48   Catching Fire (The Hunger Games)               Suzanne Collins  4.7          22614    11     2012  Fiction
...  ...                                            ...              ...          ...      ...    ...   ...
473  The Twilight Saga Collection                   Stephenie Meyer  4.7          3801     82     2009  Fiction
474  The Ugly Truth (Diary of a Wimpy Kid, Book 5)  Jeff Kinney      4.8          3796     12     2010  Fiction
513  Twilight (The Twilight Saga, Book 1)           Stephenie Meyer  4.7          11676    9      2009  Fiction
528  What Pet Should I Get? (Classic Seuss)         Dr. Seuss        4.7          1873     14     2015  Fiction
545  Wrecking Ball (Diary of a Wimpy Kid Book 14)   Jeff Kinney      4.9          9413     8      2019  Fiction

95 rows × 7 columns

In [ ]:
# book titles for top 10 authors
df_top_10_authors['Name'].value_counts()
Out[ ]:
Publication Manual of the American Psychological Association, 6th Edition       10
StrengthsFinder 2.0                                                              9
Oh, the Places You'll Go!                                                        8
The 7 Habits of Highly Effective People: Powerful Lessons in Personal Change     7
The 5 Love Languages: The Secret to Love that Lasts                              5
The 5 Love Languages: The Secret to Love That Lasts                              5
Laugh-Out-Loud Jokes for Kids                                                    5
Catching Fire (The Hunger Games)                                                 3
Knock-Knock Jokes for Kids                                                       3
Mockingjay (The Hunger Games)                                                    3
The Hunger Games Trilogy Boxed Set (1)                                           2
The Last Olympian (Percy Jackson and the Olympians, Book 5)                      2
The Hunger Games (Book 1)                                                        2
The Lost Hero (Heroes of Olympus, Book 1)                                        1
The Mark of Athena (Heroes of Olympus, Book 3)                                   1
The Meltdown (Diary of a Wimpy Kid Book 13)                                      1
Breaking Dawn (The Twilight Saga, Book 4)                                        1
The Red Pyramid (The Kane Chronicles, Book 1)                                    1
The Serpent's Shadow (The Kane Chronicles, Book 3)                               1
The Son of Neptune (Heroes of Olympus, Book 2)                                   1
The Third Wheel (Diary of a Wimpy Kid, Book 7)                                   1
The Throne of Fire (The Kane Chronicles, Book 2)                                 1
The Twilight Saga Collection                                                     1
The Ugly Truth (Diary of a Wimpy Kid, Book 5)                                    1
Twilight (The Twilight Saga, Book 1)                                             1
What Pet Should I Get? (Classic Seuss)                                           1
The Short Second Life of Bree Tanner: An Eclipse Novella (The Twilight Saga)     1
The Blood of Olympus (The Heroes of Olympus (5))                                 1
The Hunger Games                                                                 1
The House of Hades (Heroes of Olympus, Book 4)                                   1
The Getaway                                                                      1
The Five Love Languages: How to Express Heartfelt Commitment to Your Mate        1
Cabin Fever (Diary of a Wimpy Kid, Book 6)                                       1
Percy Jackson and the Olympians Paperback Boxed Set (Books 1-3)                  1
Old School (Diary of a Wimpy Kid #10)                                            1
New Moon (The Twilight Saga)                                                     1
Eclipse (Twilight)                                                               1
Eclipse (Twilight Sagas)                                                         1
Double Down (Diary of a Wimpy Kid #11)                                           1
Dog Days (Diary of a Wimpy Kid, Book 4) (Volume 4)                               1
Diary of a Wimpy Kid: The Long Haul                                              1
Diary of a Wimpy Kid: The Last Straw (Book 3)                                    1
Diary of a Wimpy Kid: Hard Luck, Book 8                                          1
Wrecking Ball (Diary of a Wimpy Kid Book 14)                                     1
Name: Name, dtype: int64
Top Author: Jeff Kinney¶
In [ ]:
df.loc[df['Author']=='Jeff Kinney']
Out[ ]:
Name Author User Rating Reviews Price Year Genre
42 Cabin Fever (Diary of a Wimpy Kid, Book 6) Jeff Kinney 4.8 4505 0 2011 Fiction
71 Diary of a Wimpy Kid: Hard Luck, Book 8 Jeff Kinney 4.8 6812 0 2013 Fiction
72 Diary of a Wimpy Kid: The Last Straw (Book 3) Jeff Kinney 4.8 3837 15 2009 Fiction
73 Diary of a Wimpy Kid: The Long Haul Jeff Kinney 4.8 6540 22 2014 Fiction
80 Dog Days (Diary of a Wimpy Kid, Book 4) (Volum... Jeff Kinney 4.8 3181 12 2009 Fiction
88 Double Down (Diary of a Wimpy Kid #11) Jeff Kinney 4.8 5118 20 2016 Fiction
253 Old School (Diary of a Wimpy Kid #10) Jeff Kinney 4.8 6169 7 2015 Fiction
381 The Getaway Jeff Kinney 4.8 5836 0 2017 Fiction
435 The Meltdown (Diary of a Wimpy Kid Book 13) Jeff Kinney 4.8 5898 8 2018 Fiction
468 The Third Wheel (Diary of a Wimpy Kid, Book 7) Jeff Kinney 4.7 6377 7 2012 Fiction
474 The Ugly Truth (Diary of a Wimpy Kid, Book 5) Jeff Kinney 4.8 3796 12 2010 Fiction
545 Wrecking Ball (Diary of a Wimpy Kid Book 14) Jeff Kinney 4.9 9413 8 2019 Fiction
In [ ]:
for author in list_of_top_ten_authors:
    print(author)
Jeff Kinney
Gary Chapman
Rick Riordan
Suzanne Collins
American Psychological Association
Dr. Seuss
Gallup
Rob Elliott
Stephen R. Covey
Stephenie Meyer

Observations

  • 75% of authors appeared on the bestseller list two or fewer times during the time period
  • The Top 10 most frequent authors are:
    • Jeff Kinney
    • Gary Chapman
    • Rick Riordan
    • Suzanne Collins
    • American Psychological Association
    • Dr. Seuss
    • Gallup
    • Rob Elliott
    • Stephen R. Covey
    • Stephenie Meyer
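
The counts above are list appearances (one row per book per year), not distinct books. A minimal sketch of how the two could be separated is shown here, using a small hand-built sample in place of `df`. The sample rows and the names `sample` and `per_author` are illustrative assumptions; only the column names `Name`, `Author`, and `Year` come from the data frame shown above.

```python
import pandas as pd

# Hypothetical sample standing in for df: one row per book per year on the list.
sample = pd.DataFrame({
    'Name': ['Book A', 'Book A', 'Book A', 'Book B', 'Book C'],
    'Author': ['Author X', 'Author X', 'Author X', 'Author Y', 'Author Y'],
    'Year': [2009, 2010, 2011, 2009, 2010],
})

# appearances = non-null Name rows per author
# distinct_titles = unique book names per author
per_author = sample.groupby('Author').agg(
    appearances=('Name', 'count'),
    distinct_titles=('Name', 'nunique'),
)

print(per_author)
```

Applied to the real data, this kind of split would separate cases such as Jeff Kinney, whose 12 appearances come from 12 different titles, from the American Psychological Association, whose 10 appearances come from a single title.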

Book Title¶

Let's investigate the most frequent book titles between 2009 and 2019.

In [ ]:
book_titles = df["Name"].value_counts()

print(book_titles)
Publication Manual of the American Psychological Association, 6th Edition       10
StrengthsFinder 2.0                                                              9
Oh, the Places You'll Go!                                                        8
The Very Hungry Caterpillar                                                      7
The 7 Habits of Highly Effective People: Powerful Lessons in Personal Change     7
                                                                                ..
Humans of New York : Stories                                                     1
Howard Stern Comes Again                                                         1
Homebody: A Guide to Creating Spaces You Never Want to Leave                     1
Have a Little Faith: A True Story                                                1
Night (Night)                                                                    1
Name: Name, Length: 351, dtype: int64

Summary Statistics¶

In [ ]:
book_titles.describe()
Out[ ]:
count    351.000000
mean       1.566952
std        1.271868
min        1.000000
25%        1.000000
50%        1.000000
75%        2.000000
max       10.000000
Name: Name, dtype: float64

Box plot¶

In [ ]:
plt.boxplot(book_titles)
plt.title('Publishing Frequency of Book Titles')
plt.xlabel('Book')
plt.ylabel('Number of Published Entries')

plt.show()
[Figure: box plot 'Publishing Frequency of Book Titles' (x: Book, y: Number of Published Entries)]

Observations

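Similar to the frequency of author names, the 75th percentile of title counts is 2, so at least 75% of book titles appear two or fewer times and at most 25% appear three or more times between 2009 and 2019. The count in the next section puts the actual figure at 41 of 351 titles, or about 12%.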
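
A percentile is a cut point rather than an exact share, so the fraction of titles on either side of the threshold can also be computed directly. The following is a minimal sketch, with a made-up `counts` series standing in for `book_titles`; the names `counts`, `threshold`, `share_at_most`, and `share_more` are assumptions for illustration.

```python
import pandas as pd

# Hypothetical appearance counts per title (stand-in for book_titles).
counts = pd.Series([10, 5, 3, 2, 2, 1, 1, 1, 1, 1])

threshold = 2  # "two or fewer appearances"

# The mean of a boolean mask is the fraction of True values.
share_at_most = (counts <= threshold).mean()
share_more = (counts > threshold).mean()

print(f"{share_at_most:.0%} of titles appear {threshold} or fewer times")
print(f"{share_more:.0%} of titles appear more than {threshold} times")
```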

Most Frequent Book Titles¶

Based on our previous observation, we will define 'most frequent' as any book title that appears 3 or more times during the time period of 2009-2019.

In [ ]:
mask_book_titles = book_titles > 2
books_most_frequent = book_titles[mask_book_titles]

print(len(books_most_frequent), 'book titles appear 3 or more times')
print(books_most_frequent)
41 book titles appear 3 or more times
Publication Manual of the American Psychological Association, 6th Edition                            10
StrengthsFinder 2.0                                                                                   9
Oh, the Places You'll Go!                                                                             8
The Very Hungry Caterpillar                                                                           7
The 7 Habits of Highly Effective People: Powerful Lessons in Personal Change                          7
The Four Agreements: A Practical Guide to Personal Freedom (A Toltec Wisdom Book)                     6
Jesus Calling: Enjoying Peace in His Presence (with Scripture References)                             6
The Official SAT Study Guide                                                                          5
To Kill a Mockingbird                                                                                 5
The 5 Love Languages: The Secret to Love That Lasts                                                   5
The 5 Love Languages: The Secret to Love that Lasts                                                   5
Laugh-Out-Loud Jokes for Kids                                                                         5
How to Win Friends & Influence People                                                                 5
Unbroken: A World War II Story of Survival, Resilience, and Redemption                                5
The Five Dysfunctions of a Team: A Leadership Fable                                                   5
Giraffes Can't Dance                                                                                  5
Wonder                                                                                                5
First 100 Words                                                                                       5
The Fault in Our Stars                                                                                4
Dear Zoo: A Lift-the-Flap Book                                                                        4
The Wonderful Things You Will Be                                                                      4
The Life-Changing Magic of Tidying Up: The Japanese Art of Decluttering and Organizing                4
Good to Great: Why Some Companies Make the Leap and Others Don't                                      4
Thug Kitchen: The Official Cookbook: Eat Like You Give a F*ck (Thug Kitchen Cookbooks)                4
The Help                                                                                              4
You Are a Badass: How to Stop Doubting Your Greatness and Start Living an Awesome Life                4
Knock-Knock Jokes for Kids                                                                            3
Catching Fire (The Hunger Games)                                                                      3
Game of Thrones Boxed Set: A Game of Thrones/A Clash of Kings/A Storm of Swords/A Feast for Crows     3
Gone Girl                                                                                             3
The Day the Crayons Quit                                                                              3
Goodnight Moon                                                                                        3
The Immortal Life of Henrietta Lacks                                                                  3
The Subtle Art of Not Giving a F*ck: A Counterintuitive Approach to Living a Good Life                3
Player's Handbook (Dungeons & Dragons)                                                                3
Milk and Honey                                                                                        3
The Whole30: The 30-Day Guide to Total Health and Food Freedom                                        3
Mindset: The New Psychology of Success                                                                3
Crazy Love: Overwhelmed by a Relentless God                                                           3
Mockingjay (The Hunger Games)                                                                         3
The Great Gatsby                                                                                      3
Name: Name, dtype: int64
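
The list above contains 'The 5 Love Languages: The Secret to Love That Lasts' twice, with 5 appearances each, differing only in the capitalization of 'that'. If the two spellings refer to the same book (an assumption not verified here), the rows would combine to 10 appearances. One possible way to merge such variants is sketched below on a small hypothetical `names` series standing in for `df['Name']`; the variable names are illustrative.

```python
import pandas as pd

# Hypothetical rows standing in for df['Name']; two spellings differ only in case.
names = pd.Series([
    'The 5 Love Languages: The Secret to Love That Lasts',
    'The 5 Love Languages: The Secret to Love that Lasts',
    'The 5 Love Languages: The Secret to Love That Lasts',
    'StrengthsFinder 2.0',
])

# Build a case-insensitive, whitespace-trimmed key for grouping.
key = names.str.strip().str.casefold()

# Count appearances per key, then label each key with its first-seen spelling.
merged_counts = key.value_counts()
first_spelling = names.groupby(key).first()
merged_counts.index = merged_counts.index.map(first_spelling)

print(merged_counts)
```

This only catches differences in case and surrounding whitespace. Variants such as 'Eclipse (Twilight)' and 'Eclipse (Twilight Sagas)' would still need manual review.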
In [ ]:
plt.figure(figsize=(8,6), dpi=80)
# one bin per appearance count
sns.histplot(data=books_most_frequent, binwidth=1)

plt.title('Most Frequent Book Titles')
plt.xlabel('Number of Times Published')
plt.ylabel('Number of Book Titles')

plt.show()
[Figure: histogram 'Most Frequent Book Titles' (x: Number of Times Published, y: Number of Book Titles)]

Top 10 Book Titles¶

In [ ]:
top_ten_books = books_most_frequent.head(10)
print(top_ten_books)
Publication Manual of the American Psychological Association, 6th Edition            10
StrengthsFinder 2.0                                                                   9
Oh, the Places You'll Go!                                                             8
The Very Hungry Caterpillar                                                           7
The 7 Habits of Highly Effective People: Powerful Lessons in Personal Change          7
The Four Agreements: A Practical Guide to Personal Freedom (A Toltec Wisdom Book)     6
Jesus Calling: Enjoying Peace in His Presence (with Scripture References)             6
The Official SAT Study Guide                                                          5
To Kill a Mockingbird                                                                 5
The 5 Love Languages: The Secret to Love That Lasts                                   5
Name: Name, dtype: int64
Bar plot¶
In [ ]:
# convert series to data frame
top_ten_books_df = pd.DataFrame({'Name': top_ten_books.index, 'Count': top_ten_books.values})

# create bar plot of data frame
plt.figure(figsize=(10,8))
sns.barplot(data=top_ten_books_df, x='Name', y='Count')
plt.title('Top 10 Books')
plt.xlabel('Book Title')
plt.xticks(rotation=90)
plt.ylabel('Count')

plt.show()
[Figure: bar plot 'Top 10 Books' (x: Book Title, y: Count)]
In [ ]:
list_of_top_ten_books = top_ten_books.index.tolist()

df_top_10_books = df[df['Name'].isin(list_of_top_ten_books)]

df_top_10_books
Out[ ]:
Name Author User Rating Reviews Price Year Genre
187 Jesus Calling: Enjoying Peace in His Presence ... Sarah Young 4.9 19576 8 2011 Non Fiction
188 Jesus Calling: Enjoying Peace in His Presence ... Sarah Young 4.9 19576 8 2012 Non Fiction
189 Jesus Calling: Enjoying Peace in His Presence ... Sarah Young 4.9 19576 8 2013 Non Fiction
190 Jesus Calling: Enjoying Peace in His Presence ... Sarah Young 4.9 19576 8 2014 Non Fiction
191 Jesus Calling: Enjoying Peace in His Presence ... Sarah Young 4.9 19576 8 2015 Non Fiction
... ... ... ... ... ... ... ...
505 To Kill a Mockingbird Harper Lee 4.8 26234 0 2013 Fiction
506 To Kill a Mockingbird Harper Lee 4.8 26234 0 2014 Fiction
507 To Kill a Mockingbird Harper Lee 4.8 26234 0 2015 Fiction
508 To Kill a Mockingbird Harper Lee 4.8 26234 0 2016 Fiction
509 To Kill a Mockingbird Harper Lee 4.8 26234 7 2019 Fiction

68 rows × 7 columns

Top Book: Publication Manual of the American Psychological Association, 6th Edition¶

In [ ]:
df.loc[df['Name']=='Publication Manual of the American Psychological Association, 6th Edition']
Out[ ]:
Name Author User Rating Reviews Price Year Genre
271 Publication Manual of the American Psychologic... American Psychological Association 4.5 8580 46 2009 Non Fiction
272 Publication Manual of the American Psychologic... American Psychological Association 4.5 8580 46 2010 Non Fiction
273 Publication Manual of the American Psychologic... American Psychological Association 4.5 8580 46 2011 Non Fiction
274 Publication Manual of the American Psychologic... American Psychological Association 4.5 8580 46 2012 Non Fiction
275 Publication Manual of the American Psychologic... American Psychological Association 4.5 8580 46 2013 Non Fiction
276 Publication Manual of the American Psychologic... American Psychological Association 4.5 8580 46 2014 Non Fiction
277 Publication Manual of the American Psychologic... American Psychological Association 4.5 8580 46 2015 Non Fiction
278 Publication Manual of the American Psychologic... American Psychological Association 4.5 8580 46 2016 Non Fiction
279 Publication Manual of the American Psychologic... American Psychological Association 4.5 8580 46 2017 Non Fiction
280 Publication Manual of the American Psychologic... American Psychological Association 4.5 8580 46 2018 Non Fiction
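
The table above lists the manual for 2009 through 2018, with no row for 2019. A quick way to check which years a title is missing from the list could look like the sketch below. It uses a hypothetical `sample` frame in place of `df`, and the titles 'Title A' and 'Title B' are placeholders.

```python
import pandas as pd

# Hypothetical stand-in for df: 'Title A' is listed in 2009-2018 but not in 2019.
sample = pd.DataFrame({
    'Name': ['Title A'] * 10 + ['Title B'],
    'Year': list(range(2009, 2019)) + [2019],
})

# Every year present anywhere in the data.
all_years = set(sample['Year'].unique())

# Years in which the chosen title appears.
title_years = set(sample.loc[sample['Name'] == 'Title A', 'Year'])

# Years covered by the data but missing for this title.
missing_years = sorted(int(year) for year in all_years - title_years)

print(missing_years)
```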
In [ ]:
for book in list_of_top_ten_books:
    print(book)
Publication Manual of the American Psychological Association, 6th Edition
StrengthsFinder 2.0
Oh, the Places You'll Go!
The Very Hungry Caterpillar
The 7 Habits of Highly Effective People: Powerful Lessons in Personal Change
The Four Agreements: A Practical Guide to Personal Freedom (A Toltec Wisdom Book)
Jesus Calling: Enjoying Peace in His Presence (with Scripture References)
The Official SAT Study Guide
To Kill a Mockingbird
The 5 Love Languages: The Secret to Love That Lasts

Observations

Top 10 most common book titles:

  • Publication Manual of the American Psychological Association, 6th Edition
  • StrengthsFinder 2.0
  • Oh, the Places You'll Go!
  • The Very Hungry Caterpillar
  • The 7 Habits of Highly Effective People: Powerful Lessons in Personal Change
  • The Four Agreements: A Practical Guide to Personal Freedom (A Toltec Wisdom Book)
  • Jesus Calling: Enjoying Peace in His Presence (with Scripture References)
  • The Official SAT Study Guide
  • To Kill a Mockingbird
  • The 5 Love Languages: The Secret to Love That Lasts

Analysis Conclusions¶

After the analysis, it's time to reflect on what was observed in the data. Let's look at our investigative questions and the answers yielded by our analysis.

Initial Questions¶

Our analysis has yielded answers to our initial questions from the Ask phase:

  • What author(s) published the most between 2009-2019?
    • Jeff Kinney is the most frequent author, with 12 appearances from his 'Diary of a Wimpy Kid' series
    • Each of the top 10 authors appeared at least 7 times during the 10-year period
  • What percentage of books were fiction vs. non-fiction?
    • Non-fiction: 56%
      • A majority (73%) of expensive books are non-fiction; the higher a book's price, the more likely it is to be non-fiction
    • Fiction: 44%
  • What book appears the most between 2009-2019?
    • Top book (appearing 10 times, once per year from 2009 through 2018): 'Publication Manual of the American Psychological Association, 6th Edition', published by the American Psychological Association
    • The top 10 books include:
      • Publication Manual of the American Psychological Association, 6th Edition
      • StrengthsFinder 2.0
      • Oh, the Places You'll Go!
      • The Very Hungry Caterpillar
      • The 7 Habits of Highly Effective People: Powerful Lessons in Personal Change
      • The Four Agreements: A Practical Guide to Personal Freedom (A Toltec Wisdom Book)
      • Jesus Calling: Enjoying Peace in His Presence (with Scripture References)
      • The Official SAT Study Guide
      • To Kill a Mockingbird
      • The 5 Love Languages: The Secret to Love That Lasts
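
The genre percentages quoted in the answers above (the overall split and the share of non-fiction among expensive books) can be derived with normalized value counts. This is a minimal sketch on a hypothetical `sample` frame rather than the real `df`; the 16 USD cutoff follows the document's definition of "expensive", and the row values are invented for illustration.

```python
import pandas as pd

# Hypothetical rows standing in for df (columns 'Price' and 'Genre' as above).
sample = pd.DataFrame({
    'Price': [5, 8, 12, 16, 17, 20, 25, 46],
    'Genre': ['Fiction', 'Fiction', 'Non Fiction', 'Fiction',
              'Non Fiction', 'Non Fiction', 'Fiction', 'Non Fiction'],
})

# Overall genre split as fractions of all rows.
overall_share = sample['Genre'].value_counts(normalize=True)

# Genre split among "expensive" rows (price greater than 16 USD).
expensive = sample[sample['Price'] > 16]
expensive_share = expensive['Genre'].value_counts(normalize=True)

print(overall_share)
print(expensive_share)
```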

Other Observations¶

  • Most (75%) bestselling books are 16 USD or less with 50% of books priced between 7 USD and 16 USD
  • There's little or no correlation between User Rating, Price, and Reviews.
  • A slim majority (56%) of books are non-fiction
  • 73% of expensive books (greater than 16 USD) are non-fiction
  • 75% of expensive books (greater than 16 USD) are priced between 17 USD and 30 USD
  • A slim majority (56%) of inexpensive books (less than 17 USD) are fiction
  • 75% of User Ratings are 4.5 or above with 50% between 4.5 and 4.8
  • 75% of authors appeared two or fewer times during the 10-year period
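
The correlation observation in the list above can be checked with a pairwise correlation matrix. The sketch below uses pandas' default Pearson correlation on a small hypothetical `sample` frame. The values are invented for illustration and are not the dataset, so the printed coefficients say nothing about the real data.

```python
import pandas as pd

# Hypothetical rows standing in for df's numeric columns.
sample = pd.DataFrame({
    'User Rating': [4.6, 4.8, 4.7, 4.5, 4.9, 4.3],
    'Reviews': [9769, 4505, 22614, 8580, 19576, 3801],
    'Price': [13, 0, 11, 46, 8, 82],
})

# Pairwise Pearson correlation (the pandas default) between the three columns.
corr_matrix = sample[['User Rating', 'Reviews', 'Price']].corr()

print(corr_matrix.round(2))
```

Coefficients near 0 indicate little linear relationship. Values near 1 or -1 indicate a strong positive or negative one.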

References¶

  • Data Source: Kaggle
  • Why Does The Amazon Book Market Share Dominate The Market?

Contact¶

  • Email
  • LinkedIn
  • GitHub
  • Portfolio