GeoIpCountryWhois.csv - Add one string.

Locked
Zinden
Posts: 11
Joined: 2004-12-31 11:47
Location: Sweden

Post by Zinden » 2006-06-05 12:01

Code:

	try {
		// This product includes GeoIP data created by MaxMind, available from http://maxmind.com/
		// Updates at http://www.maxmind.com/app/geoip_country
		string file = Util::getDataPath() + "GeoIpCountryWhois.csv";
		string data = File(file, File::READ, File::OPEN).read();

		const char* start = data.c_str();
		string::size_type linestart = 0;
		string::size_type comma1 = 0;
		string::size_type comma2 = 0;
		string::size_type comma3 = 0;
		string::size_type comma4 = 0;
		string::size_type lineend = 0;
		CountryIter last = countries.end();
		u_int32_t startIP = 0;
		u_int32_t endIP = 0, endIPprev = 0;

		for(;;) {
			comma1 = data.find(',', linestart);
			if(comma1 == string::npos) break;
			comma2 = data.find(',', comma1 + 1);
			if(comma2 == string::npos) break;
			comma3 = data.find(',', comma2 + 1);
			if(comma3 == string::npos) break;
			comma4 = data.find(',', comma3 + 1);
			if(comma4 == string::npos) break;
			lineend = data.find('\n', comma4);
			if(lineend == string::npos) break;

			startIP = Util::toUInt32(start + comma2 + 2);
			endIP = Util::toUInt32(start + comma3 + 2);
			u_int16_t* country = (u_int16_t*)(start + comma4 + 2);
			if((startIP-1) != endIPprev)
				last = countries.insert(last, make_pair((startIP-1), (u_int16_t)16191));
			last = countries.insert(last, make_pair(endIP, *country));

			endIPprev = endIP;
			linestart = lineend + 1;
		}
	} catch(const FileException&) {
	}
}
As you can see, the code only looks for four commas per line, but each line in the GeoIP file has five. Here is a sample of how the CSV file is structured:

"begin_ip","end_ip","begin_num","end_num","country","name"
"61.88.0.0","61.91.255.255","1029177344","1029439487","AU","Australia"
"61.92.0.0","61.93.255.255","1029439488","1029570559","HK","Hong Kong"
"61.94.0.0","61.94.7.255","1029570560","1029572607","ID","Indonesia"

Code:

/*	getIpCountry
	This function returns the country (abbreviation) of an IP,
	for example: it returns "PT", which stands for "Portugal"
	more info: http://www.maxmind.com/app/csv
*/
string Util::getIpCountry (string IP) {
	if (BOOLSETTING(GET_USER_COUNTRY)) {
		dcassert(count(IP.begin(), IP.end(), '.') == 3);

		//e.g IP 23.24.25.26 : w=23, x=24, y=25, z=26
		string::size_type a = IP.find('.');
		string::size_type b = IP.find('.', a+1);
		string::size_type c = IP.find('.', b+2);

		u_int32_t ipnum = (Util::toUInt32(IP.c_str()) << 24) | 
			(Util::toUInt32(IP.c_str() + a + 1) << 16) | 
			(Util::toUInt32(IP.c_str() + b + 1) << 8) | 
			(Util::toUInt32(IP.c_str() + c + 1) );

		CountryIter i = countries.lower_bound(ipnum);

		if(i != countries.end()) {
			return string((char*)&(i->second), 2);
		}
	}

	return Util::emptyString; // if nothing has been returned by now, something is wrong...
}
All of this code is in the util.cpp file.

What I'm wondering is whether someone can rework the code so that it returns the full name of the country instead of the country code.
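
For anyone following along: the begin_num and end_num columns in the CSV sample are the dotted addresses packed into a 32-bit number, the same value getIpCountry builds with its shifts. A minimal standalone sketch of that packing, using standard C++ instead of Util::toUInt32 (ipToNum is a made-up helper name, not something from util.cpp, and it assumes a well-formed dotted IPv4 address):

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical helper, not part of DC++: packs "a.b.c.d" into one 32-bit number.
// Assumes the input is a well-formed dotted IPv4 address.
std::uint32_t ipToNum(const std::string& ip) {
    std::istringstream in(ip);
    std::uint32_t result = 0;
    std::string part;
    // Each octet shifts the previous ones up by 8 bits.
    while (std::getline(in, part, '.')) {
        result = (result << 8) | static_cast<std::uint32_t>(std::stoul(part));
    }
    return result;
}
```

With the first sample row, ipToNum("61.88.0.0") gives 1029177344, which matches its begin_num column.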
Last edited by Zinden on 2006-06-08 14:34, edited 4 times in total.

BSOD2600
Forum Moderator
Posts: 503
Joined: 2003-01-27 18:47
Location: USA

Post by BSOD2600 » 2006-06-05 22:43

displaying the full country name will take up too much space in the columns. The abbreviation is just fine. If you don't know what an abbreviation means, look it up.

Zinden
Posts: 11
Joined: 2004-12-31 11:47
Location: Sweden

Post by Zinden » 2006-06-06 02:14

that's not a problem, i don't mind it taking up more space.

So i hope someone knows how to rework the code...

poy
Posts: 83
Joined: 2006-04-03 15:55

Post by poy » 2006-06-06 08:12

your code is... well... weird... :shock:

i don't get where exactly the Util::getIpCountry function returns something other than Util::emptyString.
moreover, i don't even know how you got this to compile, because you will notice there are 2 { and 3 } :roll:

Pothead
Posts: 223
Joined: 2005-01-15 06:55

Post by Pothead » 2006-06-06 09:08

Yup, poy, that confused me as well. Until i looked in Util.cpp :)

poy
Posts: 83
Joined: 2006-04-03 15:55

Post by poy » 2006-06-06 09:39

hehe, just a copy-paste thing :P

to begin i would suggest you add a comma5 the same way comma4 is added, then you change start + comma4 + 2 to start + comma5 + 2. maybe you will also need to change the type u_int16_t to a bigger one..

and in Util::getIpCountry remove the ", 2" in the return thing, so that not only the 2 first letters are returned.

but there may still be things to change :wink:
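
A sketch of that idea, for what it's worth: rather than adding one more hand-counted comma, split the line into its quoted fields and take the last one as a std::string. splitCsvLine is a hypothetical helper (it is not in Util.cpp), and it assumes every field is wrapped in double quotes with no escaped quotes inside, as in the sample rows in the first post.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: returns the contents of each "quoted" field on one line.
// Assumes there are no escaped quotes inside a field.
std::vector<std::string> splitCsvLine(const std::string& line) {
    std::vector<std::string> fields;
    std::string::size_type pos = 0;
    for (;;) {
        std::string::size_type open = line.find('"', pos);
        if (open == std::string::npos) break;
        std::string::size_type close = line.find('"', open + 1);
        if (close == std::string::npos) break;
        // Keep what sits between the two quotes, without the quotes themselves.
        fields.push_back(line.substr(open + 1, close - open - 1));
        pos = close + 1;
    }
    return fields;
}
```

For the Hong Kong sample row this yields six fields, with "HK" at index 4 and "Hong Kong" at index 5. Because it works on the quotes and not the commas, a name that itself contains a comma would not throw it off.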

Zinden
Posts: 11
Joined: 2004-12-31 11:47
Location: Sweden

Post by Zinden » 2006-06-06 11:23

i'm almost there... still having trouble getting more letters, it still shows only 2.

what i did:
removed the ", 2" in the return thing
also added it back and used "50" instead, still the same problem.
There must be somewhere else where that part needs changing.

poy
Posts: 83
Joined: 2006-04-03 15:55

Post by poy » 2006-06-06 17:26

do you get the 2 letters corresponding to the abbreviation of the country, like before, or do you get the first 2 letters of the full country name?

in the first case it must have something to do with comma4 and comma5.

Zinden
Posts: 11
Joined: 2004-12-31 11:47
Location: Sweden

Post by Zinden » 2006-06-07 10:25

i get 2 letters from the country name... like Sw... Fi...

so it reads from the right spot now... i only need to get it to show more letters.

Carraya
Posts: 112
Joined: 2004-09-21 11:43

Post by Carraya » 2006-06-07 10:39

Zinden wrote: i get 2 letters from the country name... like Sw... Fi...

so it reads from the right spot now... i only need to get it to show more letters.

The size of the column could be the limit :)

Zinden
Posts: 11
Joined: 2004-12-31 11:47
Location: Sweden

Post by Zinden » 2006-06-07 10:49

the column is a new one... i mean a separate column just for country names...
and if only 2 letters fit then i'm gonna throw my computer out the window :lol:

poy
Posts: 83
Joined: 2006-04-03 15:55

Post by poy » 2006-06-07 16:53

i'm not sure at all, but you could try changing the u_int16_t to a bigger type, like u_int32_t.

PseudonympH
Forum Moderator
Posts: 366
Joined: 2004-03-06 02:46

Post by PseudonympH » 2006-06-07 19:59

4 character abbreviations are not much better than 2 character ones :)

Zinden
Posts: 11
Joined: 2004-12-31 11:47
Location: Sweden

Post by Zinden » 2006-06-08 07:00

poy wrote: i'm not sure at all, but you could try changing the u_int16_t to a bigger type, like u_int32_t.

That didn't change anything...

Pothead
Posts: 223
Joined: 2005-01-15 06:55

Post by Pothead » 2006-06-08 08:29

BBcode enabled removes parts of the function :roll:
The answer looks like it lies in the parts of the code that got removed from the post.

Code:

		u_int32_t ipnum = (Util::toUInt32(IP.c_str()) << 24) | 
			(Util::toUInt32(IP.c_str() + a + 1) << 16) | 
			(Util::toUInt32(IP.c_str() + b + 1) << 8) | 
			(Util::toUInt32(IP.c_str() + c + 1) );

		CountryIter i = countries.lower_bound(ipnum);

		if(i != countries.end()) {
			return string((char*)&(i->second), 2);
		}
	}

	return Util::emptyString; // if nothing has been returned by now, something is wrong...
}
Now it's possible to see this line

Code:

return string((char*)&(i->second), 2);
Which kind of explains the 2 char limit.
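
The length argument is only half of it, though. The map value is a u_int16_t, which is two bytes, so two characters are all that is ever stored; asking the string constructor for more than that reads memory past the value (undefined behaviour), not more of the name. A small standalone sketch of the packing, with std::uint16_t standing in for u_int16_t (packCode and unpackCode are made-up names for illustration):

```cpp
#include <cstdint>
#include <cstring>
#include <string>

// Copies the first two characters of text into a 16-bit value,
// roughly what dereferencing the u_int16_t* does in the loader.
// Assumes text holds at least two characters.
std::uint16_t packCode(const char* text) {
    std::uint16_t value = 0;
    std::memcpy(&value, text, sizeof(value));
    return value;
}

// Rebuilds a string from the two bytes held in the value.
std::string unpackCode(std::uint16_t value) {
    char bytes[sizeof(value)];
    std::memcpy(bytes, &value, sizeof(value));
    return std::string(bytes, sizeof(value));
}
```

Here unpackCode(packCode("Sweden")) comes back as "Sw", which is the symptom reported above. The same packing accounts for the 16191 in the loader: it is 0x3F3F, so unpackCode(16191) is "??", the placeholder inserted for gaps between ranges.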

Zinden
Posts: 11
Joined: 2004-12-31 11:47
Location: Sweden

Post by Zinden » 2006-06-08 08:33

i removed that ,2
tried to have ,50 instead...

still limited to 2 characters...

gone through the code a lot of times, tried different things with no success.

Pothead
Posts: 223
Joined: 2005-01-15 06:55

Post by Pothead » 2006-06-08 10:37

Well, that limit of 2 was restricting the size of the country name it would return.
Looks like you are only storing the first 2 letters. :P
Try using breakpoints, and see where it's going wrong.

Could you also disable BBCode in your original post, so we can see all the code, instead of referring to files. :)

Also try changing
u_int16_t* country = (u_int16_t*)(start + comma4 + 2);
to use strings instead of u_int16_t's, fixing whatever that breaks. :)
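
A rough sketch of what that could look like: key the map by the end of each range as before, but store a std::string value, so lower_bound hands back the whole name. CountryMap, addRange and lookupCountry are hypothetical names for illustration only, and the gap handling (the 16191 marker entries in the original loader) is left out to keep it short.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical replacement for the u_int16_t map: end of range -> country name.
typedef std::map<std::uint32_t, std::string> CountryMap;

// Adds one range; the end() hint is cheap when rows arrive in ascending order.
void addRange(CountryMap& countries, std::uint32_t endIP, const std::string& name) {
    countries.insert(countries.end(), std::make_pair(endIP, name));
}

// Finds the first range whose end is >= ipnum, like the lower_bound call in getIpCountry.
std::string lookupCountry(const CountryMap& countries, std::uint32_t ipnum) {
    CountryMap::const_iterator i = countries.lower_bound(ipnum);
    return i != countries.end() ? i->second : std::string();
}
```

Loaded with the end_num values from the sample rows, a lookup of 1029439488 (61.92.0.0) returns "Hong Kong" in full, and nothing in the return path needs a length argument any more.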
