So if you’re like me, and you’re paranoid about data loss, you’re probably using something to back up your data. Whether that something be Ghost, Carbonite, Iron Mountain, or simple file copies, you are always making sure you’re data is protected one way or another. Now, I personally use a combination of things. I use Norton Ghost to do full drive images every other day and copy them out to an external hard drive. I also use MozyHome for my most important files (financial data, some source code, school work, etc.).
If you use MozyHome, and have ever seen this message for an un-explained reason, keep reading:

This error, according to Mozy support, means that the temporary location for caching the backups is full. Now, that may be the case for a lot of users, but it wasn’t for me. You see, Mozy is SUPPOSED to use the Windows environment variables to determine where to store its temp files (See the %TEMP% variable). In my case, this variable is set to %SystemRootTemp which is C:WindowsTemp. Where was MozyHome storing its cache? Try G:TEMP. I couldn’t figure out why this is the case, but now that I know, I’ll be sure not to delete it.
So before I realized all of this, I did a little “house cleaning” a few days ago and G:TEMP happened to be one of the directories which I removed completely. Then today I figured, maybe I should make sure my files are backed up (because I like getting warm and fuzzy feelings). So I look and sure enough, they weren’t. I did some hunting into this error and discovered all of the above mentioned information.
I have since sent a response to MozyHome’s KB article on this error message explaining that this error can crop up if there is plenty of space, but the TEMP directory is simply missing. I then proceeded to jab them with a stick by saying that their backup software was completely ignoring Windows environment variables. Then again when I told them “Why not just create the temp directory if it doesn’t exist?” They shouldn’t be making their software complain about free space when that has nothing to do with what is going on. It is a very misleading error and they need to fix it. This error sort of seems like a “cach-all”.
I don’t really have any sections on my website related to MySQL, so I figured I’d just throw together a quick post about something I found incredibly helpful. I’m sure others will find it nice too.
So in my years being trained on data architecture and database design, I learned to "normalize, normalize, normalize". Unfortunately, normalization doesn’t lend itself to very simple statistical calculations–such as those used to generate graphs or a statistical scorecard.
In the case of Angel, I’ve normalized the hell out of the data. Now it comes down to needing to report on that data. At first, my reporting attempts were terrible–tons of nested loops, many queries, etc. Well, when I spoke to a colleague of mine at work, he said that if I was running on an SAS Mainframe, the work would already be done for me using a function called "PROC TABULATE". He sort of dated himself by saying that.
Then I broke the news to him: I’m not using SAS, or any other mainframe technology. He replied by suggesting I look into some sort of MySQL equivalent to that function. So I started hunting. I came across this very nice article: http://dev.mysql.com/tech-resources/articles/wizard/print_version.html
That article shows, very effectively, how to create a nice "statistical" report using cross-tabulation techniques with normalized source data. The queries are sort of monstrous if you have a large number of fields to report on, but at that point, you’d be using code to generate the parameters for the SELECT statement.
Here’s my example:
My source data looks like the following–please note that some results are truncated because the result set would be WAY too large for a blog:
mysql> select uid, Email from players;
+-----+--------------------------+
| uid | Email |
+-----+--------------------------+
| 10 | [email protected] |
| 4 | [email protected] |
| 1 | [email protected] |
| 9 | [email protected] |
| 11 | [email protected] |
| 5 | [email protected] |
| 8 | [email protected] |
| 12 | [email protected] |
| 6 | [email protected] |
| 7 | [email protected] |
+-----+--------------------------+
mysql> select * from gamedata LIMIT 10;
+----------+---------+------+
| playerid | fieldid | data |
+----------+---------+------+
| 1 | 3 | 60 |
| 1 | 4 | 1 |
| 1 | 5 | 1 |
| 1 | 6 | 286 |
| 1 | 7 | 1 |
| 1 | 8 | 13 |
| 1 | 9 | 1 |
| 1 | 10 | 1 |
| 1 | 11 | 261 |
| 1 | 12 | 13 |
+----------+---------+------+
mysql> select FieldID, FriendlyName, GameID, FieldTypeID FROM gamefields;
+---------+---------------------------------+--------+-------------+
| FieldID | FriendlyName | GameID | FieldTypeID |
+---------+---------------------------------+--------+-------------+
| 1 | Play Time | 1 | 2 |
| 2 | Last Character | 1 | 1 |
| 3 | Total Eggs | 4 | 2 |
| 4 | Total Playtime | 4 | 2 |
| 5 | Total King Of The Hill Captures | 4 | 2 |
| 6 | Total Kills | 4 | 2 |
| 7 | Total Deaths | 4 | 2 |
| 8 | Engineer Playtime | 4 | 2 |
| 10 | Mongo Playtime | 4 | 2 |
| 11 | Grunt Playtime | 4 | 2 |
| 12 | Gladiator Playtime | 4 | 2 |
| 13 | Total Powerup Uses | 4 | 2 |
| 14 | Total Powerups Received | 4 | 2 |
+---------+---------------------------------+--------+-------------+
mysql> select typeid, typename from field_types;
+--------+------------------+
| typeid | typename |
+--------+------------------+
| 1 | Text |
| 2 | Numeric |
| 3 | Game Coordinates |
| 4 | Boolean |
| 5 | Physical Address |
| 6 | IP Address |
+--------+------------------+
What I want to know from that data is how many players have field values within specific ranges–to generate some input data for a chart output. The code I’m using does the following math (from a separate query):
$StandardDeviation = floor( ($High - $Low)/NUM_DEVIATION_RANGES );
for $i = 0 to NUM_DEVIATION_RANGES
$Min = $Low + ($StandardDeviation * $i)
$Max = ($Min + $StandardDeviation) – 1;
$Query .= “SUM( IF( CAST(gamedata.Data AS UNSIGNED) >= $Min AND CAST(gamedata.Data AS UNSIGNED) <= $Max, 1, 0) ) AS ‘$Min-$Max’,”;
end for
The output of the query from step 4 is as follows:
Now that query runs once per number of fields requested. The above query was just a single query for a single field. Note how the code generated the SELECT statement parameters (min and max) and set them as the column header in the result?
The result of that query looks like this:
+-------+--------+---------+---------+---------+---------+---------+---------+---------+---------+ | 60-87 | 88-115 | 116-143 | 144-171 | 172-199 | 200-227 | 228-255 | 256-283 | 284-311 | 312-339 | +-------+--------+---------+---------+---------+---------+---------+---------+---------+---------+ | 2 | 0 | 2 | 0 | 1 | 0 | 2 | 0 | 1 | 0 | +-------+--------+---------+---------+---------+---------+---------+---------+---------+---------+
So the output is exactly what I need for my chart. The result fields represent the X axis of the chart, and the result values represent the Y axis of the chart, all from a single query against normalized data using cross-tabulation.
I urge you to click on the link I mentioned near the top of the article if you want to learn how to do this. My example is fairly specific to my situation. That article showed me exactly what I needed to do, but I needed to apply it to my data and my reports. You’ll notice that I’ve used CAST’s in my query. The reason being that in order to accept any type of data, I’ve used the ‘TEXT’ type on my game data fields–that’s why I also have a type table and a type ID in the field types. I then query what types of fields is numeric, and only perform my calculations against them.
Doing cross-tabluation/pivot tables isn’t too bad once you know what you’re doing. I had a tough time wrapping my head around it, but after a few hours of tinkering, I finally got what I was looking for. Another thing to note is that I’m summing up 1′s and 0′s, basically a tally. If you want to sum up your real data, don’t use 1′s and 0′s; use your actual data field.
So, I go to download the latest version of the Net Beans IDE this morning to start teaching myself how to consume a web-service in Java. I use Download Accelerator Plus as a download manager because my ISP sucks and I can only pull about 100kbps down per connection, thus I need something that opens multiple connections such as DAP.
Anyway, I start downloading it and since I didn’t PAY for DAP, I have to see all these stupid ads on my screen whenever I download a file. Well today, one of the ads was the following (highlighted in green):

[Begin arrogant jackass]
So, if I know who the president is, I can get a green card. HOLY CRAP! I’m glad this is just a gimmick ad, or this country would be full of immigrants. Or… wait a minute…. it already IS! In fact, I really don’t have a problem with immigration, I just found it funny that someone is using an ad like that to try to (eventually) get personal information from someone by tricking them into thinking their simply choosing the president. When in fact, it’s not a choice. The whole thing is a single image, with one link; presumably to a form where you enter personally identifiable information or sign up for other bogus ads.
I just find that kind of crap mildly entertaining because there are actually people on this earth who believe that garbage.
I don’t really have much of a purpose to this post… most of a service announcement to any gullible individual out there who is naive to the scams on the Internet. Not only are they scams, but probably loaded with viruses and spyware and other crap you really don’t want ever touching your computer. I come across too many people, too many times, who have clicked on something they shouldn’t have, and gotten themselves screwed by a virus, malware, or root kit.
Maybe I should start charging for a class that people can take. It would only be 2 minutes long. The first minute of me introducing myself and telling them why they should listen to what I have to say. The second minute being a full minute of me repeating “Don’t click that!” followed by a swift kick to the back of the head. The world would be a smarter place.
[/end arrogant jackass]
So I’ve been toying around with Amazon’s web services lately. Amazon offers a lot of incredibly beneficial services that you can purchase on a pay-as-you-go basis. Specifically, I’ve been using EC2 and S3 which sort of go hand-in-hand if you built your own Amazon Machine Images (AMIs). Over at the Venom Game Labs website, I made a post about how we started using Amazon EC2 to host Angel. I’m really impressed at the performance of these virtual machines–which are Xen-based on (I beleive) a Red Hat Linux distribution. I built a few Fedora 8 machines and a Windows 2003 R2 Datacenter machine as well. All of which performed nicely.
If you’re looking for a test server for your application and don’t have any hardware laying around your house, I highly recommend Amazon EC2. It takes a bit to get used to at first, but once you get the hang of it, it’s easy going.
A tip for newbies: If you ever hit the “Terminate” button on one of your instances, you lose all your stuff. Don’t hit that button unless you have built your own custom AMI or have your data backed up somewhere else. For the purposes of Angel, I started with two base Fedora 8 machines and built one as an Apache HTTP server and the other as a MySQL 5.1 server. Then I built a Windows 2003 R2 Datacenter machine to run the web-services portion of Angel (since its written in C#). Once I had them the way I wanted, I bundled them up into AMI’s and then terminated my instances so I’m not paying for resources I’m not actively using. When I need them again, I can just deploy from the AMI, change a few settings–mainly with IP addresses and database connection addresses–and I’m up and running again in a matter of minutes without needing to re-install my software and restore my data.
Some useful articles (from Amazon’s documentation):