Big Companies > Startups

It was the 28th consecutive day of sun. My boss had just made two female coworkers cry over how little he paid them. Little Woodrow’s was empty because it was 3 PM on a Wednesday. We were drinking. It was part of the culture. 

“We’re going to IPO!” my boss said. As a 22-year-old reading stories of startup success in Wired, and not really grokking business fundamentals or how the dotcom crash of 2000 would affect IPOs in 2005, I was all in. I was convinced that a company doing $2M a year, growing 30% a year, and losing money was going to IPO. 

I had lucked into the promised land of startups. 

A generation, or three, of Americans has entered the workforce dreaming of startup riches. Steve Jobs, Jeff Bezos, Mark Zuckerberg, Evan Spiegel, and others had created visions of wealth that were the wildcatter oil fortunes of yesteryear.

The early employees of the tech companies, though less touted, were also able to obtain vast sums of money, creating independence, happiness, and additional financial opportunities. Founders and VCs were happy to paint that picture, talking about the unlimited financial potential of startup equity. 

It’s not all rosy. Most evangelists will acknowledge that startups are hard (depression, founder burnout) and that 90%+ of startups fail to have a meaningful exit (not to mention the importance of understanding liquidation preferences), but that’s often pitted against other startup benefits. The ability to make a difference. The idea that working in a corporation is soul-crushing drudgery. Learning on the job versus learning from corporate processes.

The 9-5 corporate job boosters are few (or non-existent).

I’ve co-founded 3 companies, 2 of which eventually had an exit. I’ve worked for 4 other startups, all 4 of which have exited (none have failed). And I’ve been at one fast-growing tech company for over 6 years now. 

Founders and VCs are Lying to You

I joined Indeed in 2010 when it was around 200 employees. Indeed, nearly 10 years later, there are approximately 10k employees.

I left Indeed from 2013 to 2016 to start Experiment Engine, so I can’t claim to have seen the entire transition, but it’s been pretty amazing. And it creates amazing opportunities for employees that stay.

There are numerous personal and professional benefits of being part of a larger, growing company. 


I’ve worked with a number of individuals for years. I’ve seen them grow and mature as adults. Get married, have kids, and enjoy life. On the other hand, because Indeed is growing, I’ve been able to work with individuals from some of the best big companies and less successful small companies. There are thousands of interesting people that share being an Indeedian. 

Employee Development

Larger companies establish teams of individuals focused on developing the employee base. Internal programs include training on managing, leadership, communication, data analysis, SEO, marketing, and more. 

Companies also provide thousands of dollars a year (maybe $10k?) for tuition reimbursement, conferences, and other learning materials. 


Big companies might move slow, but the average employee will have multiple projects under their purview. At Indeed, this could include launching a new product, working to improve the go-to-market of an existing product in an emerging market, and continuing to position a late stage product for market leadership. 

Risk Taking

It’s easier to take personal or professional risks in a large company. Large companies typically offer higher cash compensation (salary + RSUs) than a pre-IPO company. This makes it easier to take personal risks (angel investing or working on a side hustle). 

Large companies typically have greater ability to absorb failure on employee projects, as well. This means most projects are trying to achieve the best outcome possible. 

International Travel

In order to grow, Indeed became an international company with tech offices in 4 countries and sales offices in a large number more. Connecting with co-workers might require international travel to locations like Tokyo, London, Sydney, New York, etc.

VCs and Founders will sing the siren’s call of money, making a difference, and doing something new. Just know that big companies offer all of those benefits plus everything I mentioned above.

Preventing Wasted Crawls Part 1 of Many

Googlebot loves to crawl: it’ll crawl anything that looks like a URL, whether it finds it in JavaScript, HTML, or anywhere else on the page. That’s great for Google, and probably great for web users because Google learns more about the web, but it can lead to wasted crawls for site owners.

As I mentioned in a post about initial SEO decisions for The Dog Way, I blocked all category pages. I’ve loosely monitored Google’s crawls and found Googlebot crawling any available URL: pagination, sorting by size, price, color, et al.


(For those that look at log files, you’ll notice the IPs aren’t Googlebot IPs, we’re using Cloudflare to try to speed up the site and all requests come through their IPs.)

Now I need to find all of the URLs I should have blocked, but didn’t. A very, very simple Unix command does it: wget -O- url | grep -o urlpath.*\" | sort | uniq

(What I actually ran: wget -O- | grep -o dog-boots-and-shoes.*\" | sort | uniq)

Here’s what the output looks like:


To break that down.

wget -O- url: wget is a program to download files; by default it’ll save the file in the current directory. The -O- tells it to write the download to standard output instead. The url is the URL to download.

grep -o urlpath.*\" : urlpath is the part of the URL after the domain. In this case it was dog-boots-and-shoes.*\" (the \ escapes the " so the shell treats it as a literal quote character rather than the start of a quoted string). The -o outputs just the text that matches and nothing else.

sort | uniq : sorts the lines and then just outputs the unique ones.
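
As a rough sketch of the same steps, the pipeline below runs on a small made-up HTML snippet instead of live wget output, so the links and the variable names (sample_html, crawl_urls) are assumptions. It also swaps the .*\" pattern for [^"]* so that each match stops at the first closing quote.

```shell
# Hypothetical stand-in for the page that wget -O- would print.
sample_html='<a href="/dog-boots-and-shoes?dir=asc">Sort</a>
<a href="/dog-boots-and-shoes?size=small">Small</a>
<a href="/dog-boots-and-shoes?dir=asc">Sort again</a>
<a href="/about-us">About</a>'

# -o keeps only the matched text; [^"]* stops at the first closing quote.
# sort | uniq then collapses the repeated link into one line.
crawl_urls=$(printf '%s\n' "$sample_html" | grep -o 'dog-boots-and-shoes[^"]*' | sort | uniq)

printf '%s\n' "$crawl_urls"
```

With this sample the duplicate dir=asc link collapses to a single line, leaving one line per distinct URL to review.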

What were the key takeaways from that? I need to improve the handling of pagination and prevent Google from crawling URLs with limit, dir=, size=, and color=. The /p/ URLs are products and are already blocked in robots.txt. The really simple change is to just block them all in robots.txt. Other options are configuring URL parameters in Google Webmaster Tools, trying to block the links through rel=”nofollow”, or, for pagination, using rel=”next” and rel=”prev”. For right now, I’m just going with robots.txt because it’s the fastest way to fix the crawls.
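
A robots.txt fragment along these lines might look like the sketch below. The rules are hypothetical, since the site's real robots.txt is not shown here, and they rely on the * wildcard that Googlebot supports. The parameter names come from the list above, and writing limit as limit= is an assumption.

```shell
# Hypothetical rules; the real robots.txt for the site is not shown in the post.
robots_file=$(mktemp)
cat > "$robots_file" <<'EOF'
User-agent: *
# Block sorting and filtering parameters wherever they appear in a URL.
Disallow: /*limit=
Disallow: /*dir=
Disallow: /*size=
Disallow: /*color=
EOF

cat "$robots_file"
```

Patterns this broad block any URL containing those strings, so it is worth checking them against a list of real URLs before deploying.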

Initial SEO Decisions for The Dog Way

Quick Discussion of SEO For The Dog Way

If you spend a few minutes looking around The Dog Way, you’ll notice there is almost no attempt to optimize the site for search engines. Further, almost all of the content is blocked in robots.txt, and in fact, until three weeks ago, the entire site was blocked. Two reasons.

The first is that, since the products come from drop shippers, there is no original content on the site.

The second is that the main focus of customer acquisition will be through social media channels, hopefully.

Having said that, there will still be some attempts at SEO, and the plan looks like this.

1. Allow bots to crawl the home page, about us, and blog.

2. Do some keyword research to decide which keywords to target through category and sub-category pages.

3. Write entertaining, good content on the category pages, and then the sub-category pages. As each page gets content, I will unblock it in robots.txt and then submit it to Google to crawl.

4. Focus on image and video optimization after that. Dogs are cute, pictures of dogs are cute, people like clicking on them, so I’m hopeful about the last tactic.

One question that comes up with blocking an entire site in robots.txt is how long it takes to get re-included and re-crawled. Turns out that Google still checks the robots.txt every day, even when the site is blocked. You can see this by going into the logs and looking at the crawls.

Hosting for The Dog Way is done through SimpleHelix, and they keep at least one day of log files on the shared Apache server under the symbolic link access-logs. Here’s how I monitored the crawl activity.

Step 1: cd access-logs

Step 2: nice grep "Googlebot" | more

This allows me to pull out all page requests from Googlebot and then page through them a few at a time. SimpleHelix updates the past 24 hours and always stores it in ‘’, so there was no need to specify a date or anything else, though this isn’t always the case. Grep is a Unix utility for searching through files.
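
A minimal sketch of that check, run against a few made-up log lines rather than the real SimpleHelix file: the Apache combined log format, the IP addresses, the timestamps, and the variable names are all assumptions.

```shell
# Hypothetical log lines in Apache combined format; the real file name is not given.
log_file=$(mktemp)
cat > "$log_file" <<'EOF'
203.0.113.5 - - [10/Oct/2012:13:55:36 -0500] "GET /robots.txt HTTP/1.1" 200 120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
198.51.100.7 - - [10/Oct/2012:13:56:01 -0500] "GET / HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (Windows NT 6.1)"
203.0.113.5 - - [10/Oct/2012:14:02:10 -0500] "GET /about-us HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
EOF

# Keep only Googlebot requests, then print the requested path (field 7).
googlebot_paths=$(grep 'Googlebot' "$log_file" | awk '{print $7}')
printf '%s\n' "$googlebot_paths"
```

Printing only the path makes it easier to spot the daily robots.txt fetch among the other requests.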

I unblocked Google from the homepage about three weeks ago and they started crawling other links right away. It took about 3 days for the meta content to show up for TheDogWay and it now ranks number 1 for ‘thedogway’ but just page one for ‘The Dog Way’.

Getting SEM Keyword Data From Apache Log Files

The use of AdWords with the current form of The Dog Way is difficult because products are supplied through drop ship wholesalers that provide limited inventory and very small margins, but, on the plus side, there is no need to worry about fulfillment or inventory costs! Because the average gross product might be around $10, and I’m hoping the average margin per order will be around $20, there isn’t much room to bid. The CPCs I’m seeing right now are $1.50, meaning a hopeful break-even conversion rate of 7.5% ($1.50 / $20), which is not the case yet.
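
The break-even figure is the cost per click divided by the margin per order. A quick way to redo the arithmetic with different numbers, using the two figures from the paragraph above (the variable names are only for illustration):

```shell
# Figures from the paragraph above; change them to test other scenarios.
cpc=1.50
margin_per_order=20

# Break-even conversion rate = cost per click / margin per order, as a percentage.
break_even=$(awk -v cpc="$cpc" -v margin="$margin_per_order" 'BEGIN { printf "%.1f", cpc / margin * 100 }')
echo "Break-even conversion rate: ${break_even}%"
```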

Lack of budget, lack of expected success, and a very limited product suite present some challenges. Couple that with a new domain, no relevant click-through-rate history, and the options of keywords I can profitably bid on became pretty limited.

I had initially planned on targeting very long tail phrases: add in the list of products, clean up the titles, and those titles became the keywords. That hit a wall when it became clear that inventory from Doba could change randomly and that Google would not enter keywords with insufficient search volume into the auction (which means the product names I had planned on targeting). Although, I imagine it’d still be possible to pick up those search queries through broad match somehow.

I did see value in SEM, though, as a way to generate a list of keywords to target for SEO. There are other, cheaper ways to get these lists, which I’ll go over later, but they don’t give an indication of conversion rates and usability stats.

I ended up making two very simple campaigns targeting ‘Dog Clothes’ and ‘Dog Coats’, with ad groups targeting the sub-categories. I put together about 5 keywords per ad group using broad match for each. I set a small budget and then ran it to see what would happen.

Of note:

- Most of the keywords did not end up showing. Google said some keywords were too similar to other keywords (for example, ‘Affordable Dog Coats’ and ‘Affordable Dog Jackets’) and that others were too low volume. Out of the roughly 50 keywords I entered, only 3 generated traffic. I wanted to see the actual search queries, but Google claimed there was insufficient volume to show those. This is where grep comes in handy again.

I went to the access-logs directory mentioned earlier and ran a few other commands to specifically get the search queries. The command looked roughly like this, and then I’ll break it down.

grep -E "gclid|aclk" log_file.txt | awk '{print $11}' | awk -Fq\= '{print $2}' | awk -Fsource= '{print $1}' | sed -e 's/%20/ /g' | sed -e 's/"//g' > ~/search_terms.txt

When reading anything that looks like code from me, please see my general disclaimer that basically says, “I’m a business guy, not a coder.”

The parts

grep -E "gclid|aclk" log_file.txt – Instructs the utility grep to look through the log file for instances of ‘gclid’ or ‘aclk’, the parameters I’ve seen for AdWords, and pulls out those lines.

The ‘|’, or pipe, sends one command’s output into the next command’s input.

awk '{print $11}' – awk is another Unix utility for working with files, and it’s very handy for looking at columns; it uses whitespace as the column delimiter by default. The '{print $11}' is a command to print just column 11, which is the referring string, or the URL that just sent the user to the landing page.

awk -Fq\= '{print $2}' – Awk splits columns on whitespace by default, but it’s possible to specify another delimiter using -Fpattern. In this case I used ‘q=’ because that’s where Google puts the search query, but notice the ‘\’. The ‘\’ is a way to escape a character and prevent it from being used as a special character. Once I specified the delimiter as ‘q=’, the referring string gets broken into two. The part I want, with the search query, is in column 2.

awk -Fsource= '{print $1}' – Another use of the delimiter, because there is still some part of the URL after the search query that I don’t want. I used the same trick again, this time keeping column 1, to get down to just the search query.

sed -e 's/%20/ /g' – Sed is a Unix stream editor, another handy utility. The -e tells sed that the next argument is the editing command to run. The next part is essentially a find and replace. I am replacing the ‘%20’ with a space to clean up the formatting. The ‘g’ at the end is a flag to make it global, or on all instances. The basic sed replace structure is this:

sed 's/string_to_replace/new_string_to_enter/flags' (where flags is g for every instance on a line, or a number N for only the Nth instance)
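
To see what the trailing flag changes, here is a small sketch on a made-up encoded string (the string and variable names are only for illustration):

```shell
encoded='cheap%20dog%20clothes'

# No flag: only the first match on the line is replaced.
first_only=$(printf '%s\n' "$encoded" | sed -e 's/%20/ /')
# Numeric flag: only the second match is replaced.
second_only=$(printf '%s\n' "$encoded" | sed -e 's/%20/ /2')
# g flag: every match is replaced.
all_matches=$(printf '%s\n' "$encoded" | sed -e 's/%20/ /g')

printf '%s\n' "$first_only" "$second_only" "$all_matches"
```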

sed -e 's/"//g' – That gets rid of any quotation marks on the string, leaving just the search terms.

> ~/search_terms.txt – Instead of piping the previous output to another command, we’re redirecting the output to a file. The ~/ specifies my home directory and then the file name is search_terms.txt.

And now we have a list of keywords looking like this:

benfica dog clothing

cheap pet clothes for small dog

cheap designer dog clothes

cheap designer dog clothes

dog clothes

clothyes for dogs
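
The whole pipeline can be tried end to end on a few made-up log lines. This is only a sketch: the log format, the referrer layout, and the gclid values are assumptions, and it cuts the query at the next & rather than at source= so that no trailing & is left behind.

```shell
# Hypothetical log lines in Apache combined format; the referrer layout is an assumption.
log_file=$(mktemp)
cat > "$log_file" <<'EOF'
198.51.100.7 - - [10/Oct/2012:13:56:01 -0500] "GET /dog-coats?gclid=abc123 HTTP/1.1" 200 5120 "http://www.google.com/search?q=cheap%20dog%20coats&source=web" "Mozilla/5.0"
198.51.100.8 - - [10/Oct/2012:13:57:12 -0500] "GET /dog-coats HTTP/1.1" 200 5120 "-" "Mozilla/5.0"
198.51.100.9 - - [10/Oct/2012:13:58:40 -0500] "GET /dog-clothes?gclid=def456 HTTP/1.1" 200 4096 "http://www.google.com/search?q=dog%20clothes&source=web" "Mozilla/5.0"
EOF

# Steps: keep ad-click lines, take the referrer (field 11), keep the text
# after q=, cut at the next URL parameter, decode %20, and drop stray quotes.
search_terms=$(grep -E 'gclid|aclk' "$log_file" \
  | awk '{print $11}' \
  | awk -F'q=' '{print $2}' \
  | awk -F'&' '{print $1}' \
  | sed -e 's/%20/ /g' -e 's/"//g')

printf '%s\n' "$search_terms"
```

The second log line has no gclid or aclk, so it is dropped at the first step and only the two ad-click queries come out.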



I’ll write another blog post about a couple of simple ways to clean up the file, or you can put it in Excel to dedup, sort, count, et cetera.
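
The dedup and count step can also be done in the shell. A minimal sketch, using the sample search terms listed earlier (the temporary file and variable names are assumptions):

```shell
# Search terms copied from the sample output earlier in the post.
terms_file=$(mktemp)
cat > "$terms_file" <<'EOF'
benfica dog clothing
cheap pet clothes for small dog
cheap designer dog clothes
cheap designer dog clothes
dog clothes
clothyes for dogs
EOF

# sort groups identical lines, uniq -c counts each group,
# and sort -rn puts the most frequent terms first.
term_counts=$(sort "$terms_file" | uniq -c | sort -rn)
printf '%s\n' "$term_counts"

# Strip the leading count from the first line to get the top term.
top_term=$(printf '%s\n' "$term_counts" | head -n 1 | sed -e 's/^ *[0-9]* *//')
```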


Blog Intro

Welcome to the first post of Ad Free Marketing! This blog is for online marketers, or for people that like short, stilted sentences mixed in with a few typos, grammatical errors, and the occasional run-on sentence. Take your pick.

I, with my incredibly talented partner, recently launched a side-passion to experiment with social commerce, keep exploring new areas of online marketing, and more importantly, make a more enjoyable shopping experience. This blog will document the steps I take to promote customer acquisition and retention. If we’re lucky, Claire will start talking about the design and UX decisions she makes; she is also the one doing all of the front and back-end coding.

Why You Might Want To Read This Blog

The steps I take to promote The Dog Way will be chronicled here, including successes and failures. My goal with the blog is to share what works, what doesn’t, and to provide data on how the decision was made. The hope is that each blog post will provide some value to the average online marketer.

Who I Am

You can check out my LinkedIn Profile or my Indeed Resume here to get a sense of where I’ve worked, but at the end of the day, I’m an Austinite into web commerce, data stuff, and startups. I’ve worked in business development, product, analytics, and online marketing roles.

Feel free to reach out to me with questions through my Indeed Resume or leave a comment.


EJ Lawless