It is time to verify the implementation w.r.t the specifications (we hardly looked at the specifications before).
It turns out that the system does not completely correspond to the specifications:
httpstats config log
httpstats log [config]
format path 1-4 select path / exclude path /robots.txt
Changing the order is trivial, as is checking whether there is only 1 argument. Also, supplying the default configuration can be done quickly in httpstats.C: we change the type of config_file from fstream to istream*. If the configuration file is specified on the commandline, config_file will be initialized as follows:
config_file = new ifstream(config_file_name.c_str());
const string default_config_text( "format path 1-4\nselect path /\nexclude path /robots.txt");
config_file = new strstream(default_config_text.data(), default_config_text.size());
format path 1
Only now do we realize that we can probably assume that all paths in log entries start with "/", which means that we do not have to store the initial "" in the vector representation of Path. However, if we drop the initial "" component, how will we represent the path "/" ? Just reserving the vector <""> for that seems reasonable. Surprisingly, this change is extremely simple to implement: in Path::parse, we replace
string::size_type n0(ps.find_first_not_of('/',0)); if (n0!=0) // the path starts with '/' which we encode as a first "/" component. path_.push_back(""); if (n0==string::npos) // ps must be "/" return true;
string::size_type n0(ps.find_first_not_of('/',0)); if (n0==string::npos) { // ps must be "/" path_.push_back(""); return true; }
select path /~
Probably, a similar problem exists for domains, but since the specifications are not clear about that, we stick to our present implementation for Domain (and DatePattern).
The problem is not too hard to fix: in PathExpression::eval(), we should compare the strings instead of the the vectors: instead of
bool PathExpression::eval(const LogRecord& r) const { const Path& pr(r.path()); if (pr.size() < path_.size()) return false; return equal(path_.begin(), path_.end(), pr.begin()); }
bool PathExpression::eval(const LogRecord& r) const { const string& ps(r.path().str()); if (ps.size() < path_.str().size()) return false; return equal(path_.str().begin(), path_.str().end(), ps.begin()); }
Unfortunately, the string corresponding to a Path is currently not available in the Path. A quick fix would be to just store the parameter string of Path::parse(const string&) as a data member that is then returned by Path::str(). That is, however, not a good idea, since such an input string may contain unwanted / 's, as in "/a//b/".
We could also compute a ``clean'' string representation, like operator<<(ostream &,const Path &) does. The question then is: do we compute such a string data member once, e.g. at the end of Path::parse() and let subsequent calls return it or do we only compute it on demand? Note that, if there is no select on path, we will never use a PathExpression and, consequently, the vast majority of Path's will not need Path::str().
We decide to have a simple yet reasonable solution: we will use ``memoing '' where the result of an expensive ``on demand'' operation, in this case Path::str(), is stored and returned upon later calls to the operation. Thus we will add an extra (initially empty) string data member, called Path::str_, which will be used to remember the result of the first call to Path::str().
class Path { public: .. const string& str() const { if (str_.size()==0) { // compute str_ } return str_; } .. private: vector<string> path_; mutable string str_; };
Note also how we declare the Path::str_ data member of Path as mutable. This allows us to still have Path::str() as a const member function although it may update Path::str_.
Also, Path::parse() should clear Path::str_, since Path::path_ will no longer correspond with any nonempty Path::str_ that may have existed before the call to Path::parse().
In main() we change
Configuration config; if (!config.parse(*config_file)) { cerr << "Syntax error in configuration file \"" << config_file_name << "\"" << endl; return 1; }
Configuration config; unsigned int err_lineno(config.parse(*config_file)); if (err_lineno) { cerr << "Syntax error on line " << err_lineno << " in configuration file \"" << config_file_name << "\"" << endl; return 1; }