# Advanced Search Query Mining Part 3: Preparing Your Data

In part 2 of this series, we gathered all of the data we would need for our search query mining exercise into Excel, as seen below.

Now, we must take the time to prepare our data for analysis. This will include creating derived fields to bring information to the surface, flagging and deleting noise, converting counts to proportions, etc. We are going to use the power of Excel to our advantage and push our data to its limits to extract value.

Here are some of the questions our data will need to be able to answer easily:

**What search queries have high impressions but no clicks?**(might be a good negative candidate)**What search queries have resulted in a conversion?**(promote these to exact match keywords in your account).**What search queries have a below average CTR for the ad group?**(negative candidate)**What search queries have an above average cost per conversion?**(might need a new ad/landing page or may be negative candidates)

## Step #1: Format your data as an Excel Table

1. Highlight your entire SQR dataset including the header row.

2. Click on "Format as Table" from the Home ribbon

3. Choose a style that you like

4. Click "OK"

5. Repeat these steps for both the ad group and account keyword structure datasets

## Step #2: Flag Duplicate Search Queries

Duplicate search queries are broad and phrase match search queries that are already being targeted in your account as an exact match keyword. In other words, your broad and phrase match keywords might be ad-poaching your exact match keywords.

1. Go to your AdWords Editor keyword data and cut column C (keyword) and insert it at column A so that "Keyword" is your first column.

2. Create a new column "DUP_CK" in column "N" in your "sqr_data" tab. Because you formatted your dataset as a table, all you have to do is type the word "DUP_CK" in cell "N1" and a new column will be created.

3. Type this formula in "N2" (or use the formula wizard):

**=VLOOKUP(Table1,editor_kw_data!A:C,2,FALSE)**

4. After Excel is through processing, copy all of column "N" and right click then "Paste Special > Values" back into column "N". This way every time you change something in Excel it doesn’t start processing through your data again (can be annoying if you have 100K search queries).

## Step #3: Word Count Column

We will use a search query word count later in the analysis process while we are surveying our data.

1. Create a "WORD_COUNT" column in cell "O1" just as before.

2. Type this formula in cell "O2": **=****LEN(A1)-LEN(SUBSTITUTE(A1," ",""))+****1**

## Step #4: Create a search query peer CTR column

Having a CTR comparison (or reasonable CTR expectation) will be very valuable for our analysis. In order to calculate a "Peer CTR" we must subtract the search queries clicks and impressions from the ad group’s total clicks and impressions and then calculate a CTR.

1. Create a "PEER_CTR" column in cell "P1"

2. Type this formula in cell "P2":

**=(VLOOKUP(Table1,adgroup_data!A:L,4,FALSE)-Table1)/(VLOOKUP(Table1,adgroup_data!A:L,5,FALSE)-Table1)**

**Friendly formula: (ag_clilcks – sq_clicks)/(ag_imp – sq_imp) = PEER_CTR**

## Step #5: Create a search query peer conversion rate column

Like "PEER_CTR", a comparison (expected conversion rate) is a great metric to have during your search query analysis. You create it very similarly to how you just created "PEER_CTR".

1. Create a "PEER_CVR" column in cell "Q1" (CVR = conversion rate)

2. Type this formula in cell "Q2" similar to the "PEER_CTR" formula:

**=(VLOOKUP(Table1,adgroup_data!A:L,10,FALSE)-Table1)/(VLOOKUP(Table1,adgroup_data!A:L,4,FALSE)-Table1)**

**Friendly formula: (ag_conv – sq_conv)/(ag_clicks – sq_clicks) = PEER_CVR**

## Step #6: Create a search query peer cost/conversion column

In order for us to later look for a high CPA search query, we must have an average or expected CPA to compare against.

1. Create a "PEER_CPA" column in cell "R1" (CPA = cost per conversion)

**=(VLOOKUP(Table1,adgroup_data!A:L,8,FALSE)-Table1)/(VLOOKUP(Table1,adgroup_data!A:L,10,FALSE)-Table1)**

## Step #7: Create a needed impressions column

In order to answer the question, "How many impression are needed before we should expect to see clicks?" we must calculate a needed impressions metric. We do this by dividing the number of clicks you would like to see by the peer CTR.

Of course, you could use a more "statistically significant" approach as I described in my PPC analysis post. The point is to create a flag that can be used to filter out search queries with sparse data later in our analysis.

1. Create a "NEEDED_IMP" column in cell "S1"

2. Type this formula in cell "S2": **=****1.5/Table1**

## Step #8: Create a needed clicks column

In order to answer the question, "How many clicks are needed before we should expect to see conversions?" we must calculate a needed clicks metric. We do this by dividing 1-2 clicks by the peer CTR.

1. Create a "NEEDED_CLICKS" column in cell "T1"

2. Type this formula in cell "T2": **=****1.5/Table1**

## Step #9: Create an enough impressions column

In this step we are going to use the "NEEDED_IMP" column that we created earlier with an "IF" statement to set a flag that we can use as a filter later in our analysis.

1. Create an "ENOUGH_IMP" column in cell "U1"

2. Type this formula in cell "U2":

**=IF(Table1>Table1,"Y","N")**

## Step #10: Create an enough clicks column

Again we are going to use a derived field that we created earlier, "NEEDED_CLICKS" to set a flag for "ENOUGH_CLICKS".

1. Create an "ENOUGH_CLICKS" column in cell "V1"

2. Type this formula in cell "U2":

**=IF(Table1>Table1,"Y","N")**

## Step #11: Create a low CTR column

This is a very important filter that we will use later that helps us focus on search queries with below average CTRs. The formula described below will set a flag for "LOW_CTR" for search queries with a CTR that is 30%+ below its peers in that ad group.

1. Create a "LOW_CTR" column in cell "W1"

2.** **Type this formula in cell "W2":

**=IF(Table1<(Table1*0.7),"Y","N")**

## Step #12: Create a low conversion rate column

Just like low CTR, a below average conversion rate can be an attribute of a negative keyword candidate.

1. Create a "LOW_CVR" column in cell "X1"

2. Type this formula in cell "W2":

**=IF(Table1<(Table1*0.7),"Y","N")**

## Step #13: Create a high cost per conversion column

A high CPA can also offer insights into potential problems with a search query. The search query may perform better with a different landing page or perhaps it needs to be added as a negative keyword. The formula below sets a flag for search queries that have a CPA 30%+ above its peers.

1. Create a "HIGH_CPA" column in cell "Y1"

2. ** **Type this formula in cell "Y2":

**=IF(Table1>(Table1*1.3),"Y","N")**

This might seem like a lot of work, but once you get familiar with these techniques it should only take you about five minutes to create all 12 columns.

I also encourage you to experiment with creating your own derived fields for your analysis. What I have described above comes from much trial and error (with my own data), and is by no means the only way you could/should prepare your data. Start by formulating the questions you will be asking of your data. This will help you figure out what new fields, flags, etc. you will need to answer those questions.

I look forward to my next post on advanced search query mining, in which I will show you how to start mining your prepared data for insights. We will be creating a negative candidates list as well as a keyword expansion list.

*This is a guest post by Chad Summerhill, author of the blog PPC Prospector, provider of **PPC tutorials**, and **in-house AdWords specialist** at Moving Solutions, Inc. (**UPack.com** and **MoveBuilder.com**).*

## Comments

Chad, Excellent idea, However, when I try to use these formulas, I cannot get half of them to work. The biggest problem is when I look for my "#this row" I don't have it, instead I only have "@- This Row" Whats going on here

Yes I also cannot get these formulas to work and I am surprised there isn't more feedback on this post?

There seems to be two issues with the formulas above. First, in Office 2010 and later, they have changed the #this row notation to use the @ symbol instead. This just cleans up the formula and doesn't change the functionality. More info about that here: http://blogs.office.com/b/microsoft-excel/archive/2010/01/25/table-improvements-in-excel-2010.aspxSecond, Adwords has updated its interface since this was written and the data exports seem to have changed. This throws off the cell references in all the vlookups. I'm trying to figure out what data I'm supposed to be finding but its tricky so far. If anyone else has more insight, it would be appreciated by all.

Quick update, I just spent time digging through the formulas and found its pretty easy to tweak them to meet my data. You just have to pay attention to the friendly formulas listed above. And make sure your table number is correct. For example, my PEER_CTR formula looks like this instead of what is listed above:=(VLOOKUP(Table3[[#This Row],[Ad group]],adgroup_data!B:J,8,FALSE)-Table3[[#This Row],[Clicks]])/(VLOOKUP(Table3[[#This Row],[Ad group]],adgroup_data!B:J,9,FALSE)-Table3[[#This Row],[Impressions]])

Wow, thanks for taking the time to post this John! We haven't had a chance to update the formulas in the post.

No problem at all. One more thing I've noticed, is the DUP_CK column is no longer necessary since the new SQR from Adwords includes a column telling you whether or not a keyword has been added, excluded or neither. So to see keywords that definitely aren't in your account, just filter that column by None and you're all set.

## Leave a comment