Some weeks seem fated to go bad. I started last week by getting stung by a scorpion, and then I learned a new term: ISP cache. The scorpion sting stopped hurting long before I understood what kind of grief the ISP cache was causing me.
When my aunt asked me why I was blocking her on my website, I was fairly sure she was wrong. This wasn’t the first time she had had problems with my site. About six months ago, when I started using Wordfence, she complained that I was blocking her. I checked the Wordfence forums to see if anyone else was having problems, but there was nothing about random people being blocked. I assumed I had a setting in Wordfence wrong, so I cleared all the blocked IP addresses and she was able to see my website again.
Two weeks ago, when I was being flooded by zombies trying to log into my website, I bumped my security up a notch by modifying my .htaccess file. I’ll be the first to admit that I barely understand the script I used. I did a cut and paste to make sure I wasn’t introducing any problems. My readership isn’t large enough that I can afford to block anyone coming to actually read my posts.
Apparently the changes I made caused her to be classified as a web crawler and denied access. I studied the changes, decided I didn’t understand the new .htaccess script well enough to be sure, and went back to the old script. She tried again and was still blocked.
To add to the mystery, her attempt to reach my site did not show up in my logs. With the .htaccess script active, she would be blocked before I logged the attempt but once I removed the script, I should have seen her attempt in my logs. This made no sense at all.
Fortunately my aunt enjoys a good mystery and was willing to help. The next day she tried accessing various pages on my site using three different browsers. All three browsers were blocked from my front page, but she could go directly to any other page. My log files were no help. They showed her arriving from pages that I had no record of her visiting. I was only seeing a third of her visits.
It was while I was in the thought stimulation chamber (shower) that the pieces fell into place. Somehow my webpages were being cached and she was seeing the cached results of someone else’s visit. Either she was using an Internet accelerator that used a common caching scheme or her Internet provider was using an ISP cache.
Caching is certainly nothing new. If you’re a provider with limited bandwidth, it makes sense to cache the responses to all the requests that come through and only download pages you don’t already have in cache. I discussed this with my aunt, confirmed that she was not using an Internet accelerator and gave her a way to test the theory.
Sure enough, if she tried to enter my site with Opinionbypen.com she was blocked. Today’s message was that her browser was too old. I started blocking IE6 and below long ago, when I saw the correlation between IE6 and visitors trying new exploits on my site. My aunt was getting the same message using Chrome, IE9 and Firefox. She was able to visit my site if she used Opinionbypen.com/?=123. The ?=123 made the address look like a new page to the cache manager, while my site simply ignores it.
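The trick of tacking a throwaway query string onto the address can be sketched in a few lines of Python. This is only an illustration: the function name `add_cache_buster` and the parameter name `cb` are made up for the example (the test above used a bare `?=123`), and it assumes the cache in the middle treats the full URL, query string included, as its lookup key.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_cache_buster(url, token):
    """Return url with an extra throwaway query parameter appended."""
    parts = urlsplit(url)
    # Keep any query parameters the URL already has.
    query = parse_qsl(parts.query, keep_blank_values=True)
    # "cb" is a hypothetical name; any parameter the site ignores would do.
    query.append(("cb", str(token)))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(add_cache_buster("http://opinionbypen.com/", 123))
# http://opinionbypen.com/?cb=123
```

Because the new address has never been requested before, a cache keyed on the URL has nothing stored for it and has to ask the real site.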
That confirmed it: her Internet provider was using an ISP cache. Under other circumstances this would have been transparent, but since someone else using the same provider was being blocked by my site, her provider was serving her the blocked message instead of my website.
This also explained how she was able to visit my website without being in my logs. Her provider was not requesting new pages from my site because they were already in the ISP cache. Understanding this explained a lot of the anomalies I had been seeing when looking through the logs.
I don’t know how long her Internet provider caches pages, but once a cached page expired, the sequence of events would go something like this. If the would-be hacker were the first person to visit my site, they would get a blocked message and the ISP cache handler would assume it was a valid response. The next time someone visited my site, the cache handler would not query my site; it would instead send the same blocked message. If my aunt were the first visitor, everyone else for that day would see the same valid webpage that she had seen, all thanks to the magic of ISP caching.
Caching also explained why I wasn’t seeing all her visits. Because her ISP only loaded my pages once each cache period, I would only see her visit a page once a day even if she went back to my home page several times. Even then, I would only see it in my logs if she were the first visitor through her ISP to that page.
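The sequence of events, and the holes it leaves in the logs, can be modeled with a toy cache. This is a minimal sketch under loud assumptions: the `SharedCache` class and the `origin` function are invented for illustration, the cache stores one response per URL and shares it among every user, and it ignores expiry and HTTP caching headers entirely. A real ISP cache is certainly more complicated.

```python
class SharedCache:
    """Toy model of an ISP cache: one stored response per URL, shared by all users."""

    def __init__(self, origin):
        self.origin = origin      # function(url, user_agent) -> response body
        self.store = {}           # url -> cached response
        self.origin_hits = []     # requests that reached the site (what its logs see)

    def get(self, url, user_agent):
        if url not in self.store:
            # Cache miss: only now does the real site see (and log) a request.
            self.origin_hits.append((url, user_agent))
            self.store[url] = self.origin(url, user_agent)
        # Cache hit or freshly filled: everyone gets the stored copy.
        return self.store[url]


def origin(url, user_agent):
    # Hypothetical stand-in for the site: refuse IE6, serve everyone else.
    if "MSIE 6" in user_agent:
        return "BLOCKED: browser too old"
    return "front page"


cache = SharedCache(origin)
first = cache.get("/", "Mozilla/4.0 (compatible; MSIE 6.0)")  # IE6 visitor arrives first
aunt = cache.get("/", "Chrome")                               # served the cached block page
aunt_busted = cache.get("/?=123", "Chrome")                   # new URL, so a fresh request

print(first, aunt, aunt_busted, len(cache.origin_hits), sep=" | ")
```

In this model the second request for `/` never reaches the site, so it gets the blocked message and leaves no trace in `origin_hits`, while the `/?=123` request does both reach the site and return the real page.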
Only one mystery remained. Who was this person causing so much grief for me? Knowing they were using IE6 and which pages were blocked for my aunt should have allowed me to pinpoint my visitor. It was easy, but the only IP address that used IE6 and visited the blocked pages is registered in Germany. There’s no question that it was a direct match, but my aunt doesn’t live anywhere near Germany, nor should German traffic come through her Internet provider.
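For anyone wanting to run the same kind of search, here is a rough sketch of filtering an access log for IE6 requests to a known set of blocked pages. It assumes the server writes the common Apache "combined" log format, which may not match a given host's setup. The function name `ie6_visitors`, the sample lines and the IP addresses (taken from the ranges reserved for documentation) are all made up for the example and are not from the actual logs.

```python
import re

# Combined log format: ip ident user [time] "request" status size "referer" "user-agent"
LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] "(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"$'
)


def ie6_visitors(lines, blocked_paths):
    """Return {ip: set of paths} for IE6 requests to any of the given paths."""
    hits = {}
    for line in lines:
        m = LINE.match(line.strip())
        if m and "MSIE 6" in m["agent"] and m["path"] in blocked_paths:
            hits.setdefault(m["ip"], set()).add(m["path"])
    return hits


# Hypothetical sample data, not real log entries.
sample = [
    '192.0.2.10 - - [01/Jan/2013:10:00:00 +0000] "GET / HTTP/1.1" 403 512 "-" '
    '"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"',
    '198.51.100.7 - - [01/Jan/2013:10:05:00 +0000] "GET / HTTP/1.1" 200 9000 "-" '
    '"Mozilla/5.0 (Windows NT 6.1) Chrome/24.0"',
]

print(ie6_visitors(sample, {"/"}))
# {'192.0.2.10': {'/'}}
```

Any address that shows up for every page the cache was serving as blocked is a strong candidate for the visitor who poisoned it.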
I’ll report the issue to her provider and let them chase that mystery. I’m happy: my aunt knows I wasn’t blocking her, and I understand my log files a lot better.