Filtering Documents Loaded from a Directory
If -input_file_path names a directory, mlcp loads all the documents in the input directory and subdirectories by default. Use the -input_file_pattern option to filter the loaded documents based on a regular expression.
Note
Input document filtering is handled differently for -input_file_type forest. For details, see Filtering Forest Contents.
For example, the following command loads only files with a “.xml” suffix from the directory /space/bill/data:
# Windows users, see Modifying the Example Commands for Windows
$ mlcp.sh import -host localhost -port 8000 -username user \
    -password password -input_file_path /space/bill/data \
    -mode local -input_file_pattern '.*\.xml'
The mlcp tool uses Java regular expression syntax. For details, see Regular Expression Syntax.
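Because the pattern follows Java regular expression syntax, it can help to try a candidate pattern in plain Java before running an import. The following is a minimal sketch, not mlcp code: the class name InputFilePatternDemo and the helper accepts are hypothetical, and the sketch assumes the pattern must match the entire file name, as Matcher.matches() requires. It does not show how mlcp applies the pattern internally.

```java
import java.util.regex.Pattern;

public class InputFilePatternDemo {
    // Hypothetical helper: true only when the whole name matches the regex.
    // Assumption: a full match is required, not a partial (find-style) match.
    static boolean accepts(String regex, String fileName) {
        return Pattern.compile(regex).matcher(fileName).matches();
    }

    public static void main(String[] args) {
        // Same pattern as the command-line example; the backslash is
        // doubled here only because this is a Java string literal.
        String regex = ".*\\.xml";
        String[] names = {"orders.xml", "orders.xml.bak", "notes.txt"};
        for (String name : names) {
            System.out.println(name + " -> " + accepts(regex, name));
        }
    }
}
```

Under the full-match assumption, only orders.xml is accepted. The escaped dot (\.) matters: an unescaped dot matches any character, so a pattern such as '.*.xml' would also accept a name like "dataxml".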