How to find similar books like a boss?


Ever finished a great book and struggled to find your next literary adventure? You’re not alone. With so many books out there, finding similar reads can feel like a daunting task. Like many book lovers, I wondered if there was a better way to discover new reads based on the books I already enjoyed.

I began thinking about how to cluster similar books based on their content. Should I read all the books myself and create lists based on my understanding? That is not feasible in a single lifetime. Of course, there are already plenty of websites that offer suggestions based on user preferences.

The Limitations of Existing Book Recommendation Systems

Goodreads, for example, uses user-based filtering. If you like The Three Musketeers, the site might suggest The Count of Monte Cristo because many readers who enjoyed the former also liked the latter. Essentially, this is a form of market basket analysis: it creates association rules and recommends books based on the reading habits of others. Amazon, which owns Goodreads, uses a similar approach for product recommendations.
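
To make the "readers who liked X also liked Y" idea concrete, here is a minimal sketch in Python. The reading histories are made up, and counting co-occurrences is only the simplest form of the idea; it is not a description of how Goodreads or Amazon actually implement their recommendations.

```python
from collections import Counter

# Hypothetical reading histories: one set of liked books per reader.
shelves = [
    {"The Three Musketeers", "The Count of Monte Cristo", "Ivanhoe"},
    {"The Three Musketeers", "The Count of Monte Cristo"},
    {"The Three Musketeers", "Treasure Island"},
    {"The Count of Monte Cristo", "Les Miserables"},
]

def also_liked(shelves, book, top_n=3):
    """Rank other books by how many readers liked them together with `book`."""
    counts = Counter()
    for shelf in shelves:
        if book in shelf:
            counts.update(shelf - {book})
    # Sort by count (descending), then by title, so ties are deterministic.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]

recommendations = also_liked(shelves, "The Three Musketeers")
print(recommendations)
```

Notice that the ranking depends only on who read what. The content of the books never enters the calculation, so a book that few people have shelved can hardly ever reach the top.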

While this method works for popular books, it can be limiting. The system is biased toward popularity: the more popular a book is, the more likely it is to appear in your suggestions, whether it truly fits your taste or not. Niche or lesser-known books often slip through the cracks.

We all know that sometimes the perfect book is one that caters to very specific interests. And those books can be hard to recommend based on popularity alone. Often, we rely on trusted readers or personal recommendations to find these hidden gems.

A New Approach to Book Recommendations

It would be incredibly useful to automate the process of book recommendation based on objective criteria, especially for niche content.

I’ve been working on a book recommendation tool called findsimilarbooks.com and would love for you to try it out! The idea came to me after I kept searching for similar books every time I finished one I loved. Usually I would explore genres, authors, or Goodreads shelves to find them. However, I wanted something a bit more unbiased and unique.

That’s why I built this tool using Latent Dirichlet Allocation (LDA) for topic modeling! The aim is to find similar books based on underlying themes and topics, not just popular recommendations or what’s trending in a specific genre. The inspiration for using LDA came from a great video on topic modeling (for those curious: YouTube video).

Here’s what makes this tool different:

  • Unbiased Recommendations: Not tied to genres, authors, or what’s trending.
  • Topic-Based: The system analyzes the book summaries and titles to uncover hidden themes and similarities.
  • Large Dataset: The database currently includes the summaries and titles of about 1 million books from Goodreads, and I plan to keep it growing!
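
To give a feel for what LDA does with summaries, here is a toy sketch in plain Python that fits the model with collapsed Gibbs sampling. It is not the code behind findsimilarbooks.com: the function name `fit_lda`, the hyperparameters, and the four tiny "summaries" are all assumptions made for illustration, and a real project would more likely rely on a library such as gensim or scikit-learn.

```python
import random

def fit_lda(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler for LDA; returns one topic mix per document."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)

    doc_topic = [[0] * n_topics for _ in docs]       # words in doc d assigned to topic k
    topic_word = [[0] * V for _ in range(n_topics)]  # times word v is assigned to topic k
    topic_total = [0] * n_topics                     # total words assigned to topic k
    assignments = []

    # Start from a random topic for every word occurrence.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Resample the topic given all the other assignments.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta) / (topic_total[t] + V * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    # Smoothed per-document topic proportions (each row sums to 1).
    return [
        [(count + alpha) / (len(doc) + n_topics * alpha) for count in doc_topic[d]]
        for d, doc in enumerate(docs)
    ]

# Hypothetical, heavily shortened "summaries", already split into words.
summaries = [
    "musketeers sword duel king france sword".split(),
    "count prison revenge france treasure duel".split(),
    "aircraft engineer stealth jet aircraft design".split(),
    "pilot jet aircraft test flight design".split(),
]
topic_mix = fit_lda(summaries, n_topics=2)
for mix in topic_mix:
    print([round(p, 2) for p in mix])
```

Each printed row is one book's mix of topics, and books with similar mixes are candidates for "similar books". With this little text the split between topics can change with the random seed, so treat the output as an illustration only.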

I was also inspired by the idea shared here on Reddit and by Nathan Rooy’s awesome visual book recommender.

Right now, the tool is still in the development phase, so I would really appreciate any feedback. Play around with it, test the recommendations, and let me know how it feels. Your insights will help me improve it!

How It Works on FindSimilarBooks

Currently, on findsimilarbooks.com, you can use a simple keyword search. You type in a book you’ve already read (e.g., The Three Musketeers by Alexandre Dumas) and select it:

[Screenshot: search results for The Three Musketeers]

The system finds books based on their content, not just on user preferences or popularity. Below you can see the suggestions for The Three Musketeers, based on the book summaries:

[Screenshot: recommendations for The Three Musketeers]

It’s a more targeted way of finding your next read, especially if you’re into niche genres or obscure titles that might not be widely recognized.

Next, let’s search for something on an engineering topic, like Skunk Works by Ben R. Rich:

[Screenshot: search results for Skunk Works]

Then let’s select the top-left result for Skunk Works to see the recommendations:

[Screenshot: recommendations for Skunk Works]

As you can see, all the recommended books are about planes, which suggests that the topic modeling works here.
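
One common way to turn topic mixes into recommendations is to rank books by the cosine similarity between their topic distributions. The sketch below assumes that approach. Every title except Skunk Works and all the numbers are made up for illustration, and the site may well rank books differently.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two topic vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical topic mixes over three topics (made-up numbers, not site output).
book_topics = {
    "Skunk Works": [0.90, 0.05, 0.05],
    "Book A (aviation)": [0.80, 0.10, 0.10],
    "Book B (adventure)": [0.10, 0.85, 0.05],
    "Book C (cooking)": [0.05, 0.05, 0.90],
}

def most_similar(title, book_topics, top_n=2):
    """Return the top_n other books, most similar topic mix first."""
    query = book_topics[title]
    scored = [
        (other, cosine_similarity(query, mix))
        for other, mix in book_topics.items()
        if other != title
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]

ranked = most_similar("Skunk Works", book_topics)
for other, score in ranked:
    print(f"{other}: {score:.3f}")
```

Here the aviation-heavy book comes out on top because its topic mix points in almost the same direction as the query's, regardless of how many readers either book has.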

What’s Next?

I’m excited to keep improving the site, and I’d love to hear your feedback. You can try out the tool at findsimilarbooks.com and see what books it suggests for you. Your thoughts, feature suggestions, or bug reports are more than welcome on GitHub.

Let me know what you think and happy reading!

Notes regarding known limitations/issues:

  • Since the summary is the input for the model, the recommendations depend on how well it describes the content of a book. The same book can have different summaries highlighting different scenes, so different summaries can lead to different recommendations for the same book.
  • The keyword search currently returns a lot of duplicate results because the same book often appears with several different covers. For now, I let users decide which cover they want to explore, since some of the books are not yet processed by the recommendation model.
  • Currently there are around 1 million books in the database, and I plan to grow this number as far as possible.
