Unzip a list of files whose path has to be read from a table in Pentaho kettle
I am new to Pentaho Kettle and need to unzip a set of files whose paths are stored in a database table. How should I go about it?
Asked by Sushant
Your main job should consist of a transformation followed by a sub-job. The first transformation connects to your database and extracts the paths; the main job then calls a second job (Unzip), which extracts those files. In more detail, the transformation, which I called "Table input", works as follows:
Use the "Table input" step to connect to your database. When you open the step, create a new connection and enter your query in the SQL field. Write the query so that it selects only the column you need, not every column. The "Copy rows to result" step passes the values from the database to the next job.
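As a rough illustration of the query shape only (this is not Kettle code), the sketch below uses Python's built-in `sqlite3` module with an in-memory table. The table name `zip_files` and the column name `file_path` are hypothetical; substitute your own. In the "Table input" step you would enter only the `SELECT` statement.

```python
import sqlite3

# Hypothetical table and column names, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE zip_files (id INTEGER, file_path TEXT, note TEXT)")
conn.executemany(
    "INSERT INTO zip_files VALUES (?, ?, ?)",
    [(1, "/data/in/a.zip", "first"), (2, "/data/in/b.zip", "second")],
)

# Select only the column that holds the path, not every column.
QUERY = "SELECT file_path FROM zip_files ORDER BY id"
rows = conn.execute(QUERY).fetchall()

# Each row carries a single field; that field is what gets handed on.
for (file_path,) in rows:
    print(file_path)
```

Keeping the result to a single, clearly named field makes it easier to match the parameter name later on.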
The "Unzip" job works as follows: it receives the values from the previous transformation and passes them to the "Unzip file" job entry.
Things to know:
1) In the main job, double-click the Unzip job entry, go to the "Advanced" tab, and enable "Copy previous results to parameters" and "Execute for every input row". In the job specification you also have to set the path to this sub-job.
2) Still in the Unzip job entry dialog, go to the "Parameters" tab and add a parameter with the same name as the field you extract from the database.
3) Open the sub-job (Unzip in my case), right-click on the canvas, choose "Job settings", and go to the "Parameters" tab. Add a parameter with the same name as before.
4) Remember to set the destination folder for the files and to reference the received parameter in the "Unzip file" job entry.
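To make the row-by-row behaviour concrete, here is a minimal Python sketch of what the settings above amount to. It is an analogy, not Kettle's implementation: the function name `run_unzip_job` and the field name `file_path` are hypothetical. Kettle references a parameter as `${name}`, and Python's `string.Template` happens to use the same form, so it stands in for the substitution done in the "Unzip file" entry.

```python
import tempfile
import zipfile
from pathlib import Path
from string import Template


def run_unzip_job(rows, zip_setting, dest_dir):
    """Run one 'Unzip' pass per input row, using the row's fields as parameters."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    extracted = []
    for row in rows:  # analogous to "Execute for every input row"
        params = dict(row)  # analogous to "Copy previous results to parameters"
        # The parameter name must match the field name, or substitution fails.
        zip_path = Template(zip_setting).substitute(params)
        with zipfile.ZipFile(zip_path) as archive:
            archive.extractall(dest_dir)
            extracted.extend(dest_dir / name for name in archive.namelist())
    return extracted


# Demo with a throwaway archive; in the real job the rows come from the database.
work = Path(tempfile.mkdtemp())
archive_path = work / "a.zip"
with zipfile.ZipFile(archive_path, "w") as archive:
    archive.writestr("hello.txt", "hi")

rows = [{"file_path": str(archive_path)}]
result = run_unzip_job(rows, "${file_path}", work / "out")
print([p.name for p in result])
```

If the parameter name in the sub-job differed from the field name (say `path` instead of `file_path`), the substitution in this sketch would raise a `KeyError`, which mirrors why steps 2 and 3 insist on using the same name in both places.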