How to Create Search-Friendly PDFs

A question from our readers:

“How would I create a text version of these documents so that they can be crawled by search engines?”

Easy. Google already indexes PDFs. In fact, Google can index many file types these days, including:

  • Adobe Portable Document Format (.pdf)
  • Adobe PostScript (.ps)
  • Atom and RSS feeds (.atom, .rss)
  • Autodesk Design Web Format (.dwf)
  • Google Earth (.kml, .kmz)
  • Lotus 1-2-3 (.wk1, .wk2, .wk3, .wk4, .wk5, .wki, .wks, .wku)
  • Lotus WordPro (.lwp)
  • MacWrite (.mw)
  • Microsoft Excel (.xls)
  • Microsoft PowerPoint (.ppt)
  • Microsoft Word (.doc)
  • Microsoft Works (.wks, .wps, .wdb)
  • Microsoft Write (.wri)
  • Open Document Format (.odt)
  • Rich Text Format (.rtf)
  • Shockwave Flash (.swf)
  • Text (.ans, .txt)
  • Wireless Markup Language (.wml, .wap)

Now, from an SEO perspective, we have found the following to be helpful. To give your PDFs a boost, you may want to do one of the following:

  1. Create a new page summarizing the PDF’s contents and then link to the PDF itself. The theory is that the summary adds keyword phrases for Google to index and that, by linking to the PDF, you increase the document’s authority.
  2. Rename the PDFs to the keyword phrases you want to rank for (e.g., instead of article1.pdf, it becomes “real estate how to guide.pdf”). The theory is that the file name gives Google another indicator of what the article is about, and all links to the file will contain keyword phrases.
  3. Alternatively, you could copy and paste the entire PDF into a page as HTML text. The theory is that HTML documents tend to rank higher than PDF documents. You would create links to the PDF internally on the site and from other web sites. This will give the document more authority and rank it higher in the search engines. Make sure that you do not publish the full text as HTML AND have the PDF indexed; that is technically duplicate content. You may want to tell Google not to index the PDF. You can do this by disallowing the folder that contains the PDF files.
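
As a rough illustration of tips 2 and 3, here is a minimal Python sketch. It makes several assumptions that are ours, not requirements: the helper names keyword_filename and is_crawlable, the /pdfs/ folder, and the www.example.com URLs are all hypothetical, and the file name uses hyphens instead of spaces only because spaces in URLs end up percent-encoded. The robots.txt rules are checked with Python's standard urllib.robotparser module, which only approximates how a real crawler reads the file.

```python
import re
from urllib.robotparser import RobotFileParser


def keyword_filename(phrase: str, extension: str = ".pdf") -> str:
    """Turn a keyword phrase into a lowercase, hyphen-separated file name."""
    # Collapse every run of non-alphanumeric characters into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", phrase.lower()).strip("-")
    return slug + extension


# Hypothetical robots.txt that keeps crawlers out of an assumed /pdfs/ folder.
ROBOTS_TXT = """\
User-agent: *
Disallow: /pdfs/
"""


def is_crawlable(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Report whether the given robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Tip 2: a keyword-rich file name instead of article1.pdf.
    print(keyword_filename("Real Estate How To Guide"))

    # Tip 3: the PDF folder is blocked, the HTML version stays crawlable.
    base = "https://www.example.com"  # placeholder domain
    print(is_crawlable(ROBOTS_TXT, base + "/pdfs/real-estate-how-to-guide.pdf"))
    print(is_crawlable(ROBOTS_TXT, base + "/real-estate-how-to-guide.html"))
```

Keep in mind that a Disallow rule asks compliant crawlers not to fetch the files in that folder, so it is worth testing your rules like this before you rely on them.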