MESSAGE
DATE | 2011-05-16 |
FROM | Ruben Safir
SUBJECT | Re: [NYLXS - HANGOUT] C++ Workshop - Notes
On our way to adding the standard deviation to our Distribution template class, we first created a templated function called mean_list to determine the mean value for the population of all the different values in our chainlist::List< Distribution<T> > objects.
The code is as follows:
/*
 * =====================================================================================
 *
 *       Filename:  stats.h
 *
 *    Description:  simple distribution exercise
 *
 *        Version:  1.0
 *        Created:  04/15/2011 06:12:42 PM
 *       Revision:  none
 *       Compiler:  gcc
 *
 *         Author:  Ruben Safir,
 *        Company:
 *
 * =====================================================================================
 */
#ifndef STATS_H
#define STATS_H
#include <iostream>   /* std::ostream, std::cout, std::endl */
#include
#include
#include
#include
#include "linklist.h"
/*
 * =====================================================================================
 *        Class:  Distribution
 *  Description:  Keeps Track of Distribution of 6's (or anything else)
 *                in a series of List
 * =====================================================================================
 */
namespace stats{
template <class T>
class Distribution
{
    template <class U>
    friend std::ostream & operator<<(std::ostream &, const Distribution<U>&);

    public:
        /* ====================  LIFECYCLE  ======================================= */
        Distribution (T descr, int occurances = 0);
        Distribution():freq(),occurances(0){};   //value-initialize both members
        // Distribution ( const Distribution &other );   /* copy constructor */
        // ~Distribution ();                             /* destructor */

        /* ====================  ACCESSORS  ======================================= */
        T description()const{ return freq;}
        int population()const { return occurances; }

        /* ====================  MUTATORS  ======================================= */
        void increase_occ(){
            ++occurances;
            std::cout << "description " << freq << " occurances " << occurances << std::endl;
        }
        void descrease_occ(){ --occurances; }

        /* ====================  OPERATORS  ======================================= */
        //Distribution& operator = ( const Distribution &other );   /* assignment operator */
        int operator()(){ return freq; }
        bool operator==(Distribution &tmp){
            if(this->freq == tmp.freq)
                return true;
            return false;
        }

        bool operator<(Distribution &tmp){
            if(freq < tmp.freq)
                return true;
            return false;
        }
        chainlist::List< stats::Distribution<T> > * tally; //a list of distribution tallies

        float stddev(chainlist::List< Distribution<T> > *);

    protected:
        /* ====================  DATA MEMBERS  ======================================= */
    private:
        /* ====================  DATA MEMBERS  ======================================= */
        T freq;          //description of how many times found in a List
        int occurances;  //description of how many times a frequency was found in a list
}; /* -----  end of class Distribution  ----- */
template <class T>
std::ostream & operator << ( std::ostream & os, const Distribution<T> & obj )
{
    T desc = obj.description();
    int pop = obj.population();
    os << "The Identification of " << desc << " was seen " << pop ;
    return os;
} /* -----  end of function operator <<  ----- */
template <class T>
Distribution<T>::Distribution(T descr, int occ): occurances(occ){
    freq = descr;
}
//calculate the standard deviation of a distribution list
//
template <class T>
float Distribution<T>::stddev(chainlist::List< Distribution<T> > * tally){
    float dev = 0;   //placeholder value until the calculation is written

    return dev;
}
/* Routine to go through a single list and add it to an existing
 * distribution table */
template <class T>
void mount_individual_data_point(chainlist::List<T> * tabulate, chainlist::List< stats::Distribution<T> > * table);

/* Routine to find all the occurances of a type in a list of lists */
template <class T>
void take_tally(chainlist::List<T> *, chainlist::List< stats::Distribution<T> > *);
template <class T>
void take_tally(chainlist::List<T> * tabulate, chainlist::List< stats::Distribution<T> > * table){
    for(tabulate->cursor()=tabulate->front();tabulate->cursor() != tabulate->endd(); tabulate->cursor( tabulate->cursor()->next() ) ){
        //build distribution list
        mount_individual_data_point(tabulate, table);
    }
    //we are at the end of tabulate
    mount_individual_data_point(tabulate, table);
    table->sort(*table);
}
template <class T>
void mount_individual_data_point(chainlist::List<T> * tabulate, chainlist::List< stats::Distribution<T> > * table){
    int val;
    stats::Distribution<T> * j;
    val = *(tabulate->cursor()->value()); //get a value
    table->cursor()= table->front();
    //check to see if the distribution list exists
    if(!table->cursor()){
        // if not add a distribution table to the List of distributions
        j = new stats::Distribution<T> (val);
        table->insert(*j ); //now we have at least one
        delete j;
        j=table->cursor()->value();//and increased its population
        j->increase_occ();
    }else{
        //otherwise search for a distribution node described as
        //value
        table->find_value(val);
        if( table->cursor() ){
            j=table->cursor()->value();//and increase its population
            j->increase_occ();
        }else{//otherwise add a new node
            j = new stats::Distribution<T> (val);
            table->insert( *j ); //now we have one for that value
            delete j;
            j=table->cursor()->value();//and increased its population
            j->increase_occ();
        }
    }
}
template <class T>
float mean_list(chainlist::List< Distribution<T> > * tally){
    if(tally->endd() == 0){
        std::cout << "Empty List" << std::endl;
        return 0;   //a float function must return a value; nothing to average
    }

    int sum = 0;

    tally->cursor() = tally->front();
    while(tally->cursor() != tally->endd() ){
        sum += tally->cursor()->value()->population() ;
        tally->cursor(tally->cursor()->next());
    }
    //the loop stops on the last node, so add its population exactly once
    sum += tally->cursor()->value()->population() ;

    //cast first so the division is not truncated to an integer
    return static_cast<float>(sum)/(tally->size());
}
}
#endif /* STATS_H */
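For anyone who wants to try the tally-then-average idea without linklist.h at hand, here is a minimal sketch using only the standard library. It assumes std::map as a stand-in for the chainlist of Distribution objects, and the names tally_values and mean_from_tally are hypothetical, not part of stats.h. It averages the data values weighted by their counts.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical stand-in for take_tally(): count how often each value occurs.
// The std::map plays the role of the list of Distribution objects.
std::map<int, int> tally_values(const std::vector<int>& data)
{
    std::map<int, int> table;
    for (std::size_t i = 0; i < data.size(); ++i) {
        ++table[data[i]];   // operator[] creates the entry with count 0 on first sight
    }
    return table;
}

// Mean of the underlying data, recovered from the frequency table:
// sum(value * count) / sum(count).
double mean_from_tally(const std::map<int, int>& table)
{
    assert(!table.empty());
    long weighted_sum = 0;
    long total_count = 0;
    for (std::map<int, int>::const_iterator it = table.begin();
         it != table.end(); ++it) {
        weighted_sum += static_cast<long>(it->first) * it->second;
        total_count += it->second;
    }
    return static_cast<double>(weighted_sum) / total_count;
}
```

Feeding it the values 2, 4, 4, 4, 5, 5, 7, 9 produces five table entries (the value 4 is counted three times) and a mean of 5.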
The homework assignment for Tuesday is to finish the standard deviation formula, which is defined as follows.
Quoting Wikipedia: http://en.wikipedia.org/wiki/Standard_deviation
Consider a population consisting of the following eight values:
2, 4, 4, 4, 5, 5, 7, 9
These eight data points have the mean (average) of 5:
\frac{2 + 4 + 4 + 4 + 5 + 5 + 7 + 9}{8} = 5
(<<=== this is the return value of stats::mean_list)
To calculate the population standard deviation, first compute the difference of each data point from the mean, and square the result of each:
\begin{array}{lll}
(2-5)^2 = (-3)^2 = 9 && (5-5)^2 = 0^2 = 0 \\
(4-5)^2 = (-1)^2 = 1 && (5-5)^2 = 0^2 = 0 \\
(4-5)^2 = (-1)^2 = 1 && (7-5)^2 = 2^2 = 4 \\
(4-5)^2 = (-1)^2 = 1 && (9-5)^2 = 4^2 = 16
\end{array}
Next compute the average of these values, and take the square root:
\sqrt{ \frac{(9 + 1 + 1 + 1 + 0 + 0 + 4 + 16)}{8} } = 2
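The arithmetic in this worked example can be checked with a few lines of C++. This is only a sketch on a plain std::vector, not the chainlist version the homework asks for, and the function names mean_of and population_stddev are made up for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Arithmetic mean of a non-empty vector.
double mean_of(const std::vector<double>& values)
{
    assert(!values.empty());
    double sum = 0.0;
    for (std::size_t i = 0; i < values.size(); ++i) {
        sum += values[i];
    }
    return sum / values.size();
}

// Population standard deviation: square root of the average
// squared difference from the mean (divides by n).
double population_stddev(const std::vector<double>& values)
{
    const double m = mean_of(values);
    double squared_diffs = 0.0;
    for (std::size_t i = 0; i < values.size(); ++i) {
        const double d = values[i] - m;
        squared_diffs += d * d;
    }
    return std::sqrt(squared_diffs / values.size());
}
```

For the eight values 2, 4, 4, 4, 5, 5, 7, 9 the squared differences add up to 32, so the function returns the square root of 32/8, which is 2.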
This quantity is the population standard deviation; it is equal to the square root of the variance. The formula is valid only if the eight values we began with form the complete population. If they instead were a random sample, drawn from some larger, “parent” population, then we should have used 7 (which is n − 1) instead of 8 (which is n) in the denominator of the last formula, and then the quantity thus obtained would have been called the sample standard deviation. See the Estimation section of the Wikipedia article for more details.
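The n − 1 variant differs from the population formula only in the denominator. A minimal sketch, again on a plain std::vector and with a hypothetical function name, sample_stddev:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Sample standard deviation: same sum of squared differences as the
// population formula, but divided by n - 1 instead of n.
double sample_stddev(const std::vector<double>& values)
{
    assert(values.size() > 1);   // n - 1 must not be zero

    double sum = 0.0;
    for (std::size_t i = 0; i < values.size(); ++i) {
        sum += values[i];
    }
    const double mean = sum / values.size();

    double squared_diffs = 0.0;
    for (std::size_t i = 0; i < values.size(); ++i) {
        const double d = values[i] - mean;
        squared_diffs += d * d;
    }
    return std::sqrt(squared_diffs / (values.size() - 1));
}
```

For the same eight values this gives the square root of 32/7, roughly 2.14, slightly larger than the population figure of 2.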
As a slightly more complicated real-life example, the average height for adult men in the United States is about 70", with a standard deviation of around 3". This means that most men (about 68%, assuming a normal distribution) have a height within 3" of the mean (67"–73"), that is, within one standard deviation, and almost all men (about 95%) have a height within 6" of the mean (64"–76"), that is, within two standard deviations. If the standard deviation were zero, then all men would be exactly 70" tall. If the standard deviation were 20", then men would have much more variable heights, with a typical range of about 50"–90". Three standard deviations account for 99.7% of the sample population being studied, assuming the distribution is normal (bell-shaped).
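The 68%, 95% and 99.7% figures follow from the normal distribution: the fraction of values within k standard deviations of the mean is erf(k / sqrt(2)). A small sketch to check them, assuming a compiler with C++11 support for std::erf; the name fraction_within is made up for illustration.

```cpp
#include <cassert>
#include <cmath>

// Fraction of a normally distributed population that lies within
// k standard deviations of the mean: erf(k / sqrt(2)).
double fraction_within(double k)
{
    assert(k >= 0.0);
    return std::erf(k / std::sqrt(2.0));
}
```

Calling it with k = 1, 2 and 3 gives roughly 0.683, 0.954 and 0.997.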
The code is at
http://www.nylxs.com/docs/workshops/stats.h.html
Ruben