Improved Search for Backblaze’s Blog

Search has become the most powerful method to find content on the Web, both for finding websites themselves and for discovering information within websites. Our blog readers find content in both ways — using Google, Bing, Yahoo, Ask, DuckDuckGo, and other search engines to follow search results directly to our blog, and using the site search function once on our blog to find content in the blog posts themselves.

There’s a Lot of Great Content on the Backblaze Blog

Backblaze’s CEO Gleb Budman wrote the first post for this blog in March of 2008. Since that post there have been 612 more. There’s a lot of great content on this blog, as evidenced by the more than two million page views we’ve had since the beginning of this year. We typically publish two blog posts per week on a variety of topics, but we focus primarily on cloud storage technology and data backup, company news, and how-to articles (which we call “What’s the Diff?”) on how to use cloud storage and various hardware and software solutions.

Earlier this year we initiated a series of posts on entrepreneurship by our CEO and co-founder, Gleb Budman, which has proven tremendously popular. We also occasionally publish something a little lighter, such as our current Halloween video contest — there’s still time to enter!

Blog search box

The Site Search Box — Your gateway to Backblaze blog content

We Could Do a Better Job of Helping You Find It

I joined Backblaze as Content Director in July of this year. During the application process, I spent quite a bit of time reading through the blog to understand the company, the market, and its customers. That’s a lot of reading. I used the site search many times to uncover topics and posts, and discovered that site search had a number of weaknesses that made it less-than-easy to find what I was looking for.

These site search weaknesses included:

Searches were case sensitive
— Visitor could easily miss content capitalized differently than the search terms
Results showed no date or author information
— Visitor couldn’t tell how recent the post was or who wrote it
Search terms were not highlighted in context
— Visitor had to scrutinize the results to find the terms in the post
No indication of the number of results or number of pages of results
— Visitor didn’t know how fruitful the search was
No record of search terms used by visitors
— Backblaze didn’t know what visitors were searching for on its blog

I wanted to make it easier for blog visitors to find all the great content on the Backblaze blog and help me understand what our visitors are searching for. To do that, we needed to upgrade our site search.

I started with a list of goals I wanted for site search.

  1. Make it easier to find content on the blog
  2. Provide a summary of what was found
  3. Search the comments as well as the posts
  4. Highlight the search terms in the results to help find them in context
  5. Provide a record of searches to help me understand what interests our readers

I had the goals. Now, how could I find a solution to achieve them?

Our blog is built on WordPress, which has a built-in site search function that could be described as simply adequate. The most obvious of its limitations is that search results are listed chronologically, not based on “most popular,” “most occurring,” or any other metric that might make the result more relevant to your interests.

The Search for Improved (Site) Search

An obvious choice to improve site search would be to adopt Google Site Search, as many websites and blogs have done. Unfortunately, I quickly discovered that Google is sunsetting Site Search by April of 2018. That left the choice among a number of search services or WordPress-specific solutions. My immediate inclination was to see what is available specifically for WordPress.

There are a handful of search plugins for WordPress. One stood out to me for the number of installations (100,000+) and overwhelmingly high reviews: Relevanssi. Still, I had a number of questions. The first question was whether the plugin retained any search data from our site — I wanted to make sure that the privacy of our visitors is maintained, and even harvesting anonymous search data would not be acceptable to Backblaze. I wrote to the developer and was pleased by the responsiveness from Relevanssi’s creator, Mikko Saari. He explained to me that Relevanssi doesn’t have access to any of the search data from the sites using his plugin. Receiving a quick response from a developer is always a good sign. Other signs of a good WordPress plugin are recent updates and an active support forum. Relevanssi had both of these.

Our solution: Relevanssi for Site Search

The WordPress plugin Relevanssi met all of our criteria, so we installed the plugin and switched to using it for site search in September.

In addition to solving the problems listed above, our search results are now displayed based on relevance, then date (the built-in WordPress search sorts by date only). That capability is very useful on our blog, where a lot of the content from years ago is still valuable. This is also known as evergreen content.
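To make "relevance, then date" concrete, here is a minimal Python sketch of that ordering. It is only an illustration, not how Relevanssi computes or stores anything: the `Result` class, the `score` field, and the sample posts are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Result:
    title: str
    score: float      # hypothetical relevance score; higher means more relevant
    published: date


def rank(results):
    """Order by relevance first, then by newest date to break ties."""
    return sorted(results, key=lambda r: (-r.score, -r.published.toordinal()))


# Made-up sample data for illustration only.
results = [
    Result("Older evergreen post", 9.0, date(2014, 1, 21)),
    Result("Recent passing mention", 2.5, date(2017, 10, 3)),
    Result("Newer evergreen post", 9.0, date(2016, 5, 17)),
]

for r in rank(results):
    print(r.published, r.score, r.title)
```

With a date-only sort, the recent passing mention would come first. Sorting on the score first lets an older but more relevant post rise to the top.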

The new site search also enables visitors to search using Boolean operators with keywords, + for AND and - for NOT. For example, a visitor can search for

seagate drive

and see results that contain both words, or

-seagate drive

and see results that include the search term drive without the term seagate.
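For readers curious how this kind of matching can work, here is a rough Python sketch. It assumes that terms with no prefix are required, which matches the first example above, and that matching ignores case. The function names `parse_query` and `matches` are made up for illustration and are not part of Relevanssi or WordPress.

```python
import re


def parse_query(query):
    """Split a query into required and excluded terms (lowercased)."""
    required, excluded = set(), set()
    for token in query.lower().split():
        if token.startswith("-") and len(token) > 1:
            excluded.add(token[1:])          # -term means NOT
        else:
            term = token.lstrip("+")         # +term or a bare term means AND
            if term:
                required.add(term)
    return required, excluded


def matches(query, text):
    """True if text has every required term and none of the excluded ones."""
    words = set(re.findall(r"[a-z0-9]+", text.lower()))
    required, excluded = parse_query(query)
    return required <= words and not (excluded & words)


print(matches("seagate drive", "Seagate 8TB drive review"))
print(matches("-seagate drive", "Seagate 8TB drive review"))
```

A real search engine would work against an index rather than scanning each post, but the required and excluded sets capture the idea behind the two operators.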

Visitors can put search terms in quotation marks to search for an entire phrase. For example, a visitor can search for “2016 drive stats” and see results that include only that exact phrase. In addition, the site search results come with a summary, showing where the results were found (title, post, or comments). Search terms are highlighted in yellow in the content, showing exactly where the search result was found.
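Phrase matching and highlighting can be sketched in a few lines of Python. This is an assumption-laden illustration rather than the plugin's actual code: `contains_phrase` and `highlight` are hypothetical helpers, and the HTML `<mark>` element stands in for whatever markup the plugin really emits (browsers typically render `<mark>` with a yellow background).

```python
import html
import re


def contains_phrase(phrase, text):
    """Case-insensitive check that the exact phrase appears in the text."""
    return phrase.lower() in text.lower()


def highlight(text, terms):
    """Wrap case-insensitive matches of any term in <mark> tags."""
    # Longest terms first so a whole phrase wins over a word inside it.
    ordered = sorted((t for t in terms if t), key=len, reverse=True)
    if not ordered:
        return html.escape(text)
    pattern = re.compile("|".join(re.escape(t) for t in ordered), re.IGNORECASE)
    parts, last = [], 0
    for m in pattern.finditer(text):
        parts.append(html.escape(text[last:m.start()]))
        parts.append("<mark>" + html.escape(m.group(0)) + "</mark>")
        last = m.end()
    parts.append(html.escape(text[last:]))
    return "".join(parts)


print(contains_phrase("2016 drive stats", "Our 2016 Drive Stats review"))
print(highlight("Our 2016 Drive Stats review", ["2016 drive stats"]))
```

Because the comparison ignores case, a search for "drive stats" still finds "Drive Stats", which addresses the case-sensitivity weakness listed earlier.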

Screenshot of Relevanssi WordPress search results

Search results showing total number of results, hits and their location, and highlighted search terms in context

The Results Tell the Story

Since initiating the new search on our blog on September 4, there have been almost 23,000 site searches conducted, so we know you are using it. We’ve implemented pagination for the blog feed and search results so you know how many pages of results there are and made it easier to navigate to them.

Now that we have this site search data, you likely are wondering which are the most popular search terms on our blog. Here are some of the top searches:

Here’s an example of a popular post that shows up in searches. Hard Drive Stats for Q1 2017 was published on May 9, 2017. Since September 4, it has shown up over 150 times in site searches, and in the last 90 days it has been viewed over 53,000 times on our blog. https://www.backblaze.com/blog/hard-drive-failure-rates-q1-2017/

What Do You Search For?

Please tell us how you use site search and whether there are any other capabilities you’d like to see that would make it easier to find content on our blog.


Update November 8, 2017

We’ve added “Did you Mean?” to search results, which will attempt to guess what was intended in the search if few or no results are found due to misspelled or malformed search terms. A spelling correction algorithm is used to compensate for misspelled words.

You can try it out by searching for seegate. We hope this update further enhances the usefulness of our site search.
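One simple way to produce this kind of suggestion is to compare the search term against a list of known words and pick the closest one. The Python sketch below uses the standard library's `difflib` for that. It is a stand-in technique, not the algorithm the plugin uses, and the `did_you_mean` helper, the `cutoff` value, and the tiny vocabulary are all assumptions for illustration.

```python
import difflib


def did_you_mean(term, vocabulary, cutoff=0.75):
    """Return the closest known word to term, or None if nothing is close."""
    term = term.lower()
    if term in vocabulary:
        return None  # already a known word, so no suggestion is needed
    close = difflib.get_close_matches(term, vocabulary, n=1, cutoff=cutoff)
    return close[0] if close else None


# Hypothetical vocabulary; a real site would build this from its indexed content.
vocabulary = ["seagate", "drive", "backup", "storage", "stats"]

print(did_you_mean("seegate", vocabulary))
```

Raising the cutoff makes suggestions rarer but safer, while lowering it catches more typos at the risk of offering unrelated words.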


About Roderick Bauer

Roderick has held marketing, engineering, and product management positions with Adobe, Microsoft, Autodesk, and several startups. He's consulted to Apple, Microsoft, Hewlett-Packard, Stanford University, Dell, the Pentagon, and the White House. He was a Ford-Mozilla Fellow in Media and Democracy with Common Cause in Washington, D.C., where he advocated for a free, open, and accessible internet for all, reducing media consolidation, and transparency in politics and the media.