Count the number of words in a text file

83628
Deepak Singh

Is this a good approach, or is there another solution that I'm not aware of?

//C++ program to count number of words in text file

#include<fstream>
#include<iostream>
#include<string>
using namespace std;

int main()
{
    ifstream inFile; //Declares a file stream object
    string fileName;
    string word;
    int count = 0;

    cout << "Please enter the file name ";
    getline(cin,fileName);

    inFile.open(fileName.c_str());

    while(!inFile.eof())
    {               
        inFile >> word; 
        count++;
    }

    cout << "Number of words in file is " << count;
    inFile.close();

    cin.get();  
    return 0;
}
Reply
14
Please add a note on what your code is supposed to do. Are there specific things we should review? chillworld 6 years ago 2

4 answers to the question

19
Jerry Coffin

Your code has a few problems.

  1. You should learn to not use using namespace std;. It's generally frowned upon.
  2. You should never use while(!inFile.eof()). It's pretty much a guaranteed bug.
  3. You should use standard algorithms when they're applicable (as they are here).
  4. Prefer to fully initialize variables at creation (e.g., pass file name when you create a filestream object).
  5. Generally let the destructor handle destruction (e.g., let the filestream object close when it goes out of scope) [1].

I'd also strongly prefer to take command line arguments over prompting for input at run time.

I'd write the code using the standard distance algorithm, something like this:

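#include <cstdlib>
#include <fstream>
#include <iostream>
#include <iterator>
#include <string>

int main(int argc, char **argv) {
    if (argc < 2) {
        std::cerr << "Usage: count_words <infile>\n";
        return EXIT_FAILURE;
    }

    // Open the file in the constructor; the destructor closes it.
    std::ifstream infile(argv[1]);

    // A default-constructed istream_iterator is the end-of-stream sentinel.
    std::istream_iterator<std::string> in{ infile }, end;

    // Each step of the iterator extracts one whitespace-delimited word.
    std::cout << "Word count: " << std::distance(in, end) << '\n';
}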

  [1] There are a few cases where it makes sense to manually close a file. For example, if you're moving a file between file systems by copying its content, then deleting the original, you want to do everything you can to ensure the copy completed successfully (including successful closing) before you remove the original. Anything that might destroy the user's data calls for extraordinary measures to assure safety. This isn't one of those cases though.
I wouldn't declare two variables on one line (as with "in" and "end"). BЈовић 6 years ago 0
Damn you. This was a nice simple question. The one thing I would add (which is not worth a separate answer): there is no need to open the file on a separate line (use the constructor) or to close the file manually (the destructor does that; use RAII). Martin York 6 years ago 0
14
Corbin

using namespace std; is a bad habit.


Declare variables as close to use as possible. (For example, declare count just above the while loop rather than at the top.)


Controlling the loop with eof() does not work the way you think it does. The eof flag is not set until a read attempts to go past the end of the file. This means that your last read can silently fail, and the loop still counts it as a word.

The more natural way to write that loop is:

while (inFile >> word) { ++count; }

I would put a newline after the count output. It's rare for a program to end its output without a trailing newline. (Although I'm only familiar with Linux; maybe no line break is normal under Windows.)


Rather than using cin.get(), I would just run the application inside of a console. It's a bit of a strange behavior to have the program hang until you hit a key. Imagine if programs like grep, cat or wc did that. It would be a pain to use.


Since you're only using one parameter and there's no real advantage to using a user prompt, I would take the filename as an argument to the program rather than reading it from the console. (In other words, I would use argv.)


If a program can't fail, it's fairly common to omit the return statement from main. That clearly signals that the program can't fail. (Note: main is a special case; falling off its end is equivalent to return 0. A return value is always required in other non-void functions.)


If you wanted to leverage the standard library, you could actually write this much more concisely:

std::ifstream fs(...);
std::size_t count = std::distance(std::istream_iterator<std::string>(fs),
                                  std::istream_iterator<std::string>());

The specifics of this are a bit advanced, but the basic idea is that istream_iterator is a simple wrapper around operator>> that performs extractions until it no longer can. Since std::distance walks from the first iterator to the second, this just reads as many tokens as it can and returns the distance (i.e. the count). (Technically, std::distance doesn't always step through the range. For random-access iterators it does a simple subtraction. That's irrelevant here though.)


I wouldn't bother closing the file. Unless you plan on doing actual error handling, it's best to just let the scope of the file handle the close of it. When it goes out of scope, the destructor will run which will close the file if it's still open.

2
CroCo

You also need to check whether the file was opened successfully. Otherwise you should not process it.

if ( inFile.fail() )
{
  // do something
  return -1;
}
It's usually a bad idea to return negative values. The only guaranteed values are "EXIT_SUCCESS" and "EXIT_FAILURE". More importantly, certain systems (like Linux) use unsigned numbers to represent the exit status, so -1 silently becomes a large positive number. It doesn't matter as long as you only check whether "code == 0" or not, but since there's no advantage over "EXIT_FAILURE", there's no reason to use anything else. Corbin 6 years ago 1
1
Charles Chow

It is better to put using namespace inside the code block (main()) or not to use it at all. Otherwise it will cause namespace conflicts when you write a large project.