I use Piwik for my website analytics. I like that I host and control the data, and it’s a learning experience. It’s not as full-featured as Google Analytics, but it’s more than I need for my little website.
I can get an email every morning showing my site’s traffic for the day before. This email has always had a problem: it often lacked the stats for the previous day or days. I finally determined that this was because the archives were only generated when someone (like me) viewed the reports, and the email was built from those archives. I ignored the problem because there was also an iOS app that I could use to check stats, and it was easier than email. So I typically ignored the email, even when I hadn’t turned it off.
But then I ignored the stats for a while, and those “0 visitor” emails didn’t cause any alarm. Unfortunately, I had a website problem that went on longer than it should have. So it was time to get those emailed stats working.
I’m running Piwik on Debian and Apache. Everything is on the latest versions. I figured my solution was to implement their stat archiving script in a cron job. This is recommended for larger sites to limit the impact of compiling the stats.
Their instructions for setting this up are pretty good, so I won’t repeat everything here. I’ll just add some comments on the issues I had.
Curl is needed. I installed the php5-curl package.
sudo aptitude install php5-curl
Rather than schedule a job through crontab I just added the script file to the /etc/cron.hourly directory.
The script contains three lines, only one of which does anything important:
#Generate Piwik Stats – Everything below is on one line
/usr/bin/php5 /path/to/piwik/misc/cron/archive.php --url=http://example.org/piwik/ --accept-invalid-ssl-certificate > /home/example/piwik-archive.log
Most of the parameters are described in the Piwik documentation. I added --accept-invalid-ssl-certificate to deal with my self-signed SSL certificate. This is not the best security choice, but I figure the risk is minimal for me.
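For reference, here is a rough sketch of how the whole script file could be put together. It assumes the third line is the usual #!/bin/sh shebang, which isn't shown above, and it uses a made-up file name (piwik-archive). The sketch writes the file to a scratch directory so it can be checked before it goes anywhere near /etc/cron.hourly; the paths and URL are the same placeholders used above.

```shell
#!/bin/sh
# Sketch only: build the hourly script in a scratch directory first.
# "piwik-archive" is a hypothetical file name, not one Piwik requires.
workdir=$(mktemp -d)
script="$workdir/piwik-archive"

# The quoted 'EOF' keeps the shell from expanding anything in the body.
# Assumption: the first line is a plain /bin/sh shebang.
cat > "$script" <<'EOF'
#!/bin/sh
#Generate Piwik Stats
/usr/bin/php5 /path/to/piwik/misc/cron/archive.php --url=http://example.org/piwik/ --accept-invalid-ssl-certificate > /home/example/piwik-archive.log
EOF

# cron only runs files in /etc/cron.hourly that are executable.
chmod +x "$script"

echo "Wrote $script"
```

Once the file looks right, it can be copied into /etc/cron.hourly with sudo.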
Don’t forget, like I did, to make the script executable: chmod +x /etc/cron.hourly/scriptfile
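A related gotcha: on Debian the files in /etc/cron.hourly are run by run-parts, which by default skips any file whose name contains characters other than letters, digits, underscores and hyphens. So a name like scriptfile.sh would be silently ignored. Below is a small sketch of a check for both conditions. The function name cron_script_ok is my own invention, and it only mimics the default run-parts naming rule rather than calling run-parts itself.

```shell
#!/bin/sh
# Sketch: succeed only if run-parts (default rules) would run the file.
# cron_script_ok is a hypothetical helper, not part of cron or Piwik.
cron_script_ok() {
    path=$1
    name=$(basename "$path")

    # Default run-parts rule: letters, digits, underscores, hyphens only.
    case $name in
        ''|*[!A-Za-z0-9_-]*)
            echo "bad name: $name"
            return 1
            ;;
    esac

    # The file must also be executable.
    if [ ! -x "$path" ]; then
        echo "not executable: $path"
        return 1
    fi

    echo "ok: $path"
}
```

Usage would be something like: cron_script_ok /etc/cron.hourly/scriptfile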
Hourly updates are more than I need, but at least I don’t have to worry about coordinating the email and the stats generation.