⌛⌛ Text search inside a 7z file
Place the pom.xml file in the round7/sevenzipsearch directory of your local repository and create the src/main/java subdirectory in this directory. You may name your classes freely, but they must belong to the fi.tuni.prog3.sevenzipsearch package. Therefore, your files must be in the round7/sevenzipsearch/src/main/java/fi/tuni/prog3/sevenzipsearch directory, and each file must have the statement package fi.tuni.prog3.sevenzipsearch; at the very beginning.
NetBeans creates the pom.xml file and the directory structure for this task automatically, provided that you enter the correct values when creating a Maven project. See below for the values of the groupId, artifactId and version elements.
Test material is available in the round7/sevenzipsearch directory of the remote material repository.
In this task, you will try handling files compressed with 7z compression by using the Apache Commons Compress library. You will actually need the following two libraries as dependencies:
Apache Commons Compress
Info page from Maven Central: https://search.maven.org/artifact/org.apache.commons/commons-compress/1.21/jar
Documentation: https://commons.apache.org/proper/commons-compress/examples.html
XZ
Info page from Maven Central: https://search.maven.org/artifact/org.tukaani/xz/1.9/jar
Note! This dependency is needed only indirectly: the Apache Commons Compress library uses it internally. Your code does not need to refer to this library anywhere. It is enough to add the dependency definition to the pom.xml file.
7z files (Wikipedia article) are similar to zip files but they use a more efficient compression algorithm.
Your task is to implement a program that searches for occurrences of a given search word in the text files contained in a given 7z file. To be more precise, the program must work as follows:
The program prints the prompt “File:” and then reads the file name from the user. Next it prints the prompt “Query:” and reads the word to be searched. Afterwards one empty line is printed.
The program scans the files in the given 7z file and performs a word search in each text file it finds.
A file is recognized as a text file based on its ending: the search is performed if and only if the file name ends with “.txt”.
At the beginning of each search the name of the file is printed out.
At the end of each search one extra new line is printed.
Performing the word search:
The file is read one line at a time, and each line is searched for all occurrences of the search word, ignoring character case.
If at least one occurrence is found, the line in question is printed out in the form “line number: line”, where line number is the number of the line in question (the first line in the file has number 1), and line is the line in question formatted in such a way that all occurrences of the search word have been changed to upper case letters.
The example outputs clarify the format.
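The text-file check and the per-line search described above can be sketched with the standard library alone. This is only one possible approach, not the reference solution: the class and method names (WordSearchSketch, isTextFile, highlight, search) are hypothetical, the sketch reads from any Reader instead of a 7z entry, and it assumes that replacing each match with the upper-cased search word is equivalent to upper-casing the match itself.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; in the actual project it would also need the
// "package fi.tuni.prog3.sevenzipsearch;" statement at the top.
class WordSearchSketch {

    // A file counts as a text file only if its name ends with ".txt".
    static boolean isTextFile(String name) {
        return name.endsWith(".txt");
    }

    // Returns the line with every case-insensitive occurrence of the query
    // replaced by the upper-cased query, or null if there is no occurrence.
    static String highlight(String line, String query) {
        // Pattern.quote makes the query match literally, not as a regex.
        Pattern pattern = Pattern.compile(Pattern.quote(query),
                Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
        Matcher matcher = pattern.matcher(line);
        if (!matcher.find()) {
            return null;
        }
        String upper = query.toUpperCase(Locale.ROOT);
        // quoteReplacement keeps '$' and '\' in the query from being
        // interpreted as group references in the replacement string.
        return matcher.replaceAll(Matcher.quoteReplacement(upper));
    }

    // Reads the text one line at a time and collects "lineNumber: line"
    // for every line that contains the query. Line numbers start at 1.
    static List<String> search(Reader text, String query) {
        List<String> hits = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(text)) {
            String line;
            int lineNumber = 0;
            while ((line = reader.readLine()) != null) {
                lineNumber++;
                String highlighted = highlight(line, query);
                if (highlighted != null) {
                    hits.add(lineNumber + ": " + highlighted);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return hits;
    }

    public static void main(String[] args) {
        String text = "Java is owned by Oracle.\nNo match here.\noracle and ORACLE\n";
        for (String hit : search(new StringReader(text), "Oracle")) {
            System.out.println(hit);
        }
    }
}
```

Running the sketch prints the lines "1: Java is owned by ORACLE." and "3: ORACLE and ORACLE". The exact spacing of the real output should still be checked against the example output files.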
The automatic tests, and the ones given below, assume that you make the following definitions in your pom.xml project file:
The value of groupId is fi.tuni.prog3.
The value of artifactId is sevenzipsearch.
The value of version is 1.0.
The values of the maven.compiler.source and maven.compiler.target elements are 17 or lower. The grader uses Java 17, so any newer versions won’t work.
A Onejar plugin definition where the value of the mainClass element refers to the main class of your program, which you can name freely in this task. For example, if your main class is named SevenZipSearch, the element value is fi.tuni.prog3.sevenzipsearch.SevenZipSearch.
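Put together, the definitions listed above and the two dependencies named earlier might look roughly like the following pom.xml. This is a sketch under assumptions: the dependency coordinates are read off the Maven Central links given in this task, the main class name SevenZipSearch is only the example name used above, and the plugin coordinates (com.jolira:onejar-maven-plugin, version 1.4.4) are an assumption that should be checked against the course material.

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>fi.tuni.prog3</groupId>
  <artifactId>sevenzipsearch</artifactId>
  <version>1.0</version>
  <properties>
    <!-- 17 or lower, because the grader uses Java 17 -->
    <maven.compiler.source>17</maven.compiler.source>
    <maven.compiler.target>17</maven.compiler.target>
  </properties>
  <dependencies>
    <dependency>
      <groupId>org.apache.commons</groupId>
      <artifactId>commons-compress</artifactId>
      <version>1.21</version>
    </dependency>
    <!-- Needed only indirectly by Commons Compress for 7z support -->
    <dependency>
      <groupId>org.tukaani</groupId>
      <artifactId>xz</artifactId>
      <version>1.9</version>
    </dependency>
  </dependencies>
  <build>
    <plugins>
      <!-- Assumption: plugin coordinates and version are not stated in the task text -->
      <plugin>
        <groupId>com.jolira</groupId>
        <artifactId>onejar-maven-plugin</artifactId>
        <version>1.4.4</version>
        <executions>
          <execution>
            <configuration>
              <!-- Example main class name; replace with your own -->
              <mainClass>fi.tuni.prog3.sevenzipsearch.SevenZipSearch</mainClass>
            </configuration>
            <goals>
              <goal>one-jar</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</project>
```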
Hint: Heikki Hyyrö’s coding demonstration video in Chapter 7.2.
NB: Do NOT create several Scanner instances reading the standard input (System.in). Only create one and use the same one for the whole program. Only the first Scanner attached to System.in will receive input. The same holds true if you use another class like InputStreamReader to read the user input.
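One way to follow this advice is to create the Scanner once in main and pass it to every method that needs input. The sketch below is illustrative only: the names PromptSketch, readInputs and readLineOrEmpty are hypothetical, and the exact prompt format (here a prompt followed by one space and no newline) is an assumption that should be compared against the example output files.

```java
import java.util.Scanner;

// Hypothetical class name; the real class would also declare
// "package fi.tuni.prog3.sevenzipsearch;".
class PromptSketch {

    // Returns the next input line, or an empty string if the input has ended.
    static String readLineOrEmpty(Scanner input) {
        return input.hasNextLine() ? input.nextLine() : "";
    }

    // Reads the file name and the search word through the one shared Scanner.
    static String[] readInputs(Scanner input) {
        // Assumption: prompts are printed without a trailing newline.
        System.out.print("File: ");
        String fileName = readLineOrEmpty(input);
        System.out.print("Query: ");
        String query = readLineOrEmpty(input);
        // The task text asks for one empty line after the inputs.
        System.out.println();
        return new String[] {fileName, query};
    }

    public static void main(String[] args) {
        // The only Scanner attached to System.in in the whole program.
        Scanner input = new Scanner(System.in);
        String[] inputs = readInputs(input);
        System.out.println("Would search " + inputs[0] + " for " + inputs[1]);
    }
}
```

Passing the Scanner as a parameter also makes the input-reading method easy to try out with a Scanner built over a plain string instead of System.in.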
Testing
You can test your program with the test input files java.7z and Dracula.7z, and the example output files output1.txt, output2.txt and output3.txt. Your program finds the files without any additional definitions when they are at their original location, that is, the round7/sevenzipsearch directory of your local repository.
When implementing the task, it might be a good idea to inspect the contents of the 7z files java.7z and Dracula.7z. Many operating systems, for example Ubuntu Linux, know how to open 7z files without a separate program. Otherwise you can use some compression program that supports the 7z format. A suitable choice might be 7-zip (https://www.7-zip.org/), which is installed on the university computers.
Compile your program with mvn package and run the tests as java -jar target/sevenzipsearch-1.0.one-jar.jar in the sevenzipsearch directory, that is, in the root directory of the project.
For the first test the file name is java.7z and the search word is Oracle, for the second one Dracula.7z and under, and for the third one Dracula.7z and press. The expected outputs of these three tests are given in the output1.txt, output2.txt and output3.txt files.