Thursday, August 22, 2013

A Better Container Choice?

Newsgroup: comp.lang.c++
Subject: A Better Container Choice?
From: mrc2323@...
Date: Thu, 22 Aug 2013 12:52:37 -0700



My current application has 2 large data sets that are combined into a
single data set that I must access by (part of) a string value.
Currently the structure is declared as a map object, but after
populating the basic information I add information from another, much
larger database, in a many-to-one relationship.

Here's the fundamental information I use:

#include <map>
#include <string>
#include <vector>

using namespace std;

struct Res_Struct              // Individual Event Finisher data
{
    int  resEvtNum;            // link to Events table
    int  resYear;              // Event Year
    int  resOAll;              // OverAll Finish position
    int  resD_P;               // Division Place
    long resTime;              // Finish Time
} resWork;

struct Hist_Fins               // individual Finisher's results
{
    int        evtNum;         // Result's Event # link
    string     PRF;            // P/R indicator
    Res_Struct histInfo;       // Finisher's result(s) info
} histWork;

vector<Hist_Fins>::iterator hIter;

struct Fin_Struct              // Individual Finisher data
{
    long   finLink;            // unique Finisher (link)
    char   finGender;          // gender
    int    finCount;           // # Finishes by this participant
    string finName;            // Finisher Name (Last, First M.)
    string finDoB;             // (derived) DoB from event Age/Year
    vector<Hist_Fins> histVect;
} finWork;

map<int, Fin_Struct> finMap;
map<int, Fin_Struct>::iterator fIter;



Yes, this seems a bit convoluted, but the application has been
growing in size and complexity, and I've not had time to redesign...

The important issue here is that I have ~160,000 records that
construct the basic information in the Fin_Struct. My other data
(~400,000 records) comprise the information that populates the
"histVect" object: 1 to 200 vector items in each map object. The input
data files are flat text data files (referencing some earlier posts
about file I/O efficiency).

Note that the map has an integer key value, and values range from 101
through ~160,000. I don't use the "name" as a key because I normally
scan the entire map object to look for objects that match some part of
the name value (e.g. I want to find all objects with names that start
with "WAL", etc.).
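To make that access pattern concrete, here is a minimal sketch of such a full scan. It uses a trimmed-down stand-in for Fin_Struct, and findByPrefix is a made-up helper name for illustration, not something from the actual application.

```cpp
#include <map>
#include <string>
#include <vector>

// Trimmed-down stand-in for the Fin_Struct above; only the fields
// needed for the name search are kept (an assumption for brevity).
struct Fin_Struct
{
    long        finLink;   // unique Finisher (link)
    std::string finName;   // Finisher Name (Last, First M.)
};

// Hypothetical helper: returns the keys of all entries whose finName
// begins with `prefix`. Every element is visited once (linear scan).
std::vector<int> findByPrefix(const std::map<int, Fin_Struct>& finMap,
                              const std::string& prefix)
{
    std::vector<int> hits;
    for (std::map<int, Fin_Struct>::const_iterator it = finMap.begin();
         it != finMap.end(); ++it)
    {
        // Compare only the first prefix.size() characters of the name.
        if (it->second.finName.compare(0, prefix.size(), prefix) == 0)
            hits.push_back(it->first);
    }
    return hits;
}
```

A search like this visits every entry no matter which container holds the data, so the map's key ordering gives it no advantage here.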

The use of an STL map doesn't seem best, because I don't use the map
in a traditional way, and the loading of the map takes a lot of time
<sigh>. Since the data objects are consecutive in an integer range, I
wonder if another container would be a better choice. I could use a
vector (and reserve a good amount of space "going in", rather than let
slow runtime growth occur), but I think I'd lose significant "load
time" by not referencing a map, as I'd have to scan the vector 400,000
or more times during the 2nd file population...

Both files contain the integer value that links them, as well as the
"name" string.
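For what it's worth, a vector might not need scanning at all. If the link values really are dense in the 101 to ~160,000 range, the link minus the first key could serve directly as the vector index. Below is a minimal sketch under that assumption. The structs are trimmed, and kFirstKey, slotFor, addFinisher and addHistory are invented names for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Trimmed-down stand-ins for the structs above (assumption for brevity).
struct Hist_Fins
{
    int         evtNum;    // Result's Event # link
    std::string PRF;       // P/R indicator
};

struct Fin_Struct
{
    long        finLink;   // unique Finisher (link)
    std::string finName;   // Finisher Name (Last, First M.)
    std::vector<Hist_Fins> histVect;
};

const int kFirstKey = 101; // assumption: lowest link value in the data

// Converts a link value into a vector index (no searching involved).
inline std::size_t slotFor(int link)
{
    assert(link >= kFirstKey);
    return static_cast<std::size_t>(link - kFirstKey);
}

// First file: store a finisher at the slot derived from its link.
// Any skipped link values leave default-constructed slots behind.
void addFinisher(std::vector<Fin_Struct>& fins, const Fin_Struct& f)
{
    std::size_t i = slotFor(static_cast<int>(f.finLink));
    if (i >= fins.size())
        fins.resize(i + 1);
    fins[i] = f;
}

// Second file: attach one history record by direct indexing.
// Returns false when the link falls outside the loaded range.
bool addHistory(std::vector<Fin_Struct>& fins, int link, const Hist_Fins& h)
{
    if (link < kFirstKey)
        return false;
    std::size_t i = slotFor(link);
    if (i >= fins.size())
        return false;
    fins[i].histVect.push_back(h);
    return true;
}
```

Calling reserve() on the vector before the first pass (with roughly the expected record count) would avoid reallocation during loading, and each of the ~400,000 history records would then cost one index computation rather than a scan or a tree lookup.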

Any thoughts? TIA







via Usenet Forums - Usenet Search,Free Usenet - comp.lang.c++ http://www.pocketbinaries.com/usenet-forums/showthread.php?71655-A-Better-Container-Choice&goto=newpost
