Mining ads.txt files for Competitive Market Research
April 29, 2019 | By Alan Reed
Ads.txt is a widely supported initiative to combat ad fraud. (What is ads.txt?) The objective is to increase transparency in the ad supply chain. Increased transparency makes it harder for fraudsters to avoid detection. However, there are privacy trade-offs.
Publishers often reuse publisher IDs on multiple sites. This information appears in ads.txt files and can be used to link together related publishers. The data can be used for competitive market research, for finding profitable niche sites, or for lead generation.
How to use ads.txt for Market Research or Lead Generation?
Finding competitor sites is simple. First, choose a domain you are interested in. Then search for that domain on adstxt.com. You will see a list of the ads.txt records on the site. Next, click on a publisher ID on that page. You will see a list of all domains using that publisher ID.
Publisher IDs are often reused by publishers on multiple sites. This is because ad exchanges like Google AdSense, AppNexus, and RubiconProject do not make it easy to generate different publisher IDs even if the publisher has different, unrelated sites.
Ad optimization companies like AdThrive, Ezoic, MediaVine, and MonetizeMore reuse the same publisher IDs on all of their clients' sites. This means you can easily find all the publishers using these services as well.
Pro Tip: Ad optimization companies only work with large, profitable publishers. So this is an easy way to generate a large list of profitable niche and authority sites. If you are a company that provides services to web publishers (like SEOs or Digital Marketing Agencies) you have just generated a stellar list of leads!
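The linking idea above boils down to an inverted index: group domains by the publisher IDs they list, then keep the IDs that show up on more than one site. Here is a minimal sketch of that idea, not the way any particular tool implements it. The function names, the example domains, and the publisher IDs are all hypothetical.

```python
from collections import defaultdict


def build_publisher_index(records):
    """Group domains by the (ad system, publisher ID) pairs they list."""
    index = defaultdict(set)
    for site, ad_system, publisher_id in records:
        key = (ad_system.strip().lower(), publisher_id.strip())
        index[key].add(site.strip().lower())
    return index


def shared_ids(index, min_domains=2):
    """Keep only the IDs that appear on at least `min_domains` sites."""
    return {
        key: sorted(sites)
        for key, sites in index.items()
        if len(sites) >= min_domains
    }


# Hypothetical sample data: (site, ad system domain, publisher ID)
SAMPLE_RECORDS = [
    ("site-a.example", "google.com", "pub-0000000000000001"),
    ("site-b.example", "google.com", "pub-0000000000000001"),
    ("site-c.example", "google.com", "pub-0000000000000002"),
    ("site-a.example", "appnexus.com", "1234"),
]

if __name__ == "__main__":
    index = build_publisher_index(SAMPLE_RECORDS)
    for (ad_system, publisher_id), sites in shared_ids(index).items():
        print(ad_system, publisher_id, sites)
```

The ad system is part of the key because a publisher ID is only meaningful within the exchange that issued it. The same string on two different exchanges does not link two sites.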
Step-by-Step Market Research
Let's pick the domain camperreport.com for this example. This is a popular authority site all about campers. Camperreport.com is made by incomeschool.com, a company that teaches others how to make profitable affiliate and ad-based websites. However, IncomeSchool does not publish the list of all of their sites or their students' sites. Follow along to see how to find many of these other sites.
Step 1: Go to www.adstxt.com/publishers/camperreport.com.
This will bring you to a page with all ads.txt records for camperreport.com. Camperreport has 66 ads.txt records at this time. You can also visit their ads.txt file directly to see the records. You will also notice a comment at the top that says "# AdThrive ads.txt v2.13".
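If you would rather read the file yourself, an ads.txt file is plain text. Each record is a comma-separated line (ad system domain, publisher ID, relationship, and an optional certification authority ID), and anything after a "#" is a comment. Here is a rough sketch of a parser under those assumptions. The function names are my own, the sample text is made up for illustration (only the comment line comes from the real file), and the fetch helper assumes the site serves the file over HTTPS at /ads.txt.

```python
from urllib.request import urlopen


def fetch_ads_txt(domain, timeout=10):
    """Download a site's ads.txt (assumes HTTPS and the standard /ads.txt path)."""
    with urlopen(f"https://{domain}/ads.txt", timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def parse_ads_txt(text):
    """Return (records, comments) parsed from the text of an ads.txt file."""
    records, comments = [], []
    for raw_line in text.splitlines():
        # Everything after "#" is a comment, e.g. "# AdThrive ads.txt v2.13".
        line, _, comment = raw_line.partition("#")
        if comment.strip():
            comments.append(comment.strip())
        fields = [field.strip() for field in line.split(",")]
        # Skip blank lines, variable lines such as "contact=...", and bad rows.
        if len(fields) < 3 or not all(fields[:3]):
            continue
        records.append({
            "ad_system": fields[0].lower(),
            "publisher_id": fields[1],
            "relationship": fields[2].upper(),
            "cert_authority_id": fields[3] if len(fields) > 3 and fields[3] else None,
        })
    return records, comments


# Illustrative sample only; the IDs below are placeholders, not real records.
SAMPLE = """\
# AdThrive ads.txt v2.13
google.com, pub-0000000000000000, DIRECT, f08c47fec0942fa0
appnexus.com, 1234, RESELLER
contact=ads@site.example
"""

if __name__ == "__main__":
    records, comments = parse_ads_txt(SAMPLE)
    print(comments)
    for record in records:
        print(record)
```

The comments are collected separately because, as in this example, a comment at the top of the file can hint at which ad optimization company manages it.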
Step 2: Search by a publisher ID in camperreport.com's ads.txt file.
For this example, I chose "pub-8501674430909082". Google AdSense publisher IDs start with "pub-". I know it is difficult to generate different publisher IDs in AdSense, so it is likely to be reused. Tip: Not all publisher IDs in an ads.txt file will be reused. You may have to check multiple publisher IDs.
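Since you may have to try several publisher IDs, it helps to decide which ones to check first. The sketch below picks out the AdSense-style IDs (the ones starting with "pub-") and sorts them by how many domains list each one. It assumes you already have those counts from somewhere; the function names, IDs, and counts here are hypothetical.

```python
def adsense_ids(publisher_ids):
    """Return the unique IDs that start with "pub-", in first-seen order."""
    unique = []
    for publisher_id in publisher_ids:
        if publisher_id.startswith("pub-") and publisher_id not in unique:
            unique.append(publisher_id)
    return unique


def rank_by_reuse(publisher_ids, domain_counts):
    """Sort IDs by how many domains list them, most reused first."""
    return sorted(
        publisher_ids,
        key=lambda publisher_id: domain_counts.get(publisher_id, 0),
        reverse=True,
    )


# Hypothetical IDs taken from one site's ads.txt file.
SITE_IDS = ["pub-0000000000000001", "1234", "pub-0000000000000002", "pub-0000000000000001"]

# Hypothetical counts of how many domains list each ID.
DOMAIN_COUNTS = {"pub-0000000000000001": 3, "pub-0000000000000002": 2100, "1234": 40}

if __name__ == "__main__":
    for publisher_id in rank_by_reuse(adsense_ids(SITE_IDS), DOMAIN_COUNTS):
        print(publisher_id, DOMAIN_COUNTS.get(publisher_id, 0))
```

An ID that appears on only one domain tells you nothing about related sites, so the heavily reused IDs are the ones worth clicking through first.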
We can see that "pub-8501674430909082" appears on over 2,000 domains. We have a winner! If you look at the ads.txt file for a couple of the domains in this list you will see they all contain the same exact publisher IDs and even the same "AdThrive" comment at the top.
There could be false positives in this list. Anyone can put anything in their own ads.txt file. However, the majority of these domains use AdThrive to optimize their ads.
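One way to weed out false positives is to compare whole files rather than a single ID. Sites managed by the same ad optimization company tend to carry nearly identical sets of publisher IDs, so a site that shares one ID but little else is suspect. Here is a minimal sketch using Jaccard similarity (shared IDs divided by total distinct IDs). The 0.8 threshold, the function names, and the sample data are all assumptions chosen for illustration.

```python
def jaccard(ids_a, ids_b):
    """Jaccard similarity of two ID collections: |A & B| / |A | B|."""
    a, b = set(ids_a), set(ids_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def likely_same_network(reference_ids, candidates, threshold=0.8):
    """Return candidate domains whose ID set closely matches the reference set."""
    return sorted(
        domain
        for domain, ids in candidates.items()
        if jaccard(reference_ids, ids) >= threshold
    )


# Hypothetical publisher ID sets.
REFERENCE_IDS = {"a", "b", "c", "d"}
CANDIDATES = {
    "site-x.example": {"a", "b", "c", "d"},       # identical file
    "site-y.example": {"a"},                      # shares a single ID only
    "site-z.example": {"a", "b", "c", "d", "e"},  # one extra record
}

if __name__ == "__main__":
    print(likely_same_network(REFERENCE_IDS, CANDIDATES))
```

A lower threshold catches sites running an older or customized version of the shared file, at the cost of letting more unrelated sites through.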
Step 3: Check domains one-by-one.
There are a lot of interesting things you can do with this list. Ad optimization companies only partner with successful publishers, so, false positives aside, these sites are very likely profitable. This is a great list of potential leads or ideas for a niche site.
Let's wrap up our original goal of finding other sites created by IncomeSchool. IncomeSchool has previously revealed that they also own improvephotography.com. We can check the list of domains to confirm improvephotography.com is present. It is. Now we just need to check all of these sites to see if they were created by the same company. In a quick manual search of these sites we found over a dozen sites that were likely created by IncomeSchool students. An automated search would turn up many more.
List of Publisher Groups
Here are a few of the publisher networks we have found:
Questions, comments, corrections? Let us know.