Tutorial: Extracting CAD and 3D Graphic Image Data with HULFT Integrate REST Adapters
A global manufacturer of milling and hole-making tools maintains millions of product catalog pages that contain production information and complex specifications, such as three-dimensional geometry data. This HULFT customer's primary concern is managing the vast variability of its parts and keeping its product catalog current through a simple, robust automated process that improves delivery to its downstream customers and partners.
Leveraging HULFT Integrate, the HULFT team created a solution to automate this process of linking the most up-to-date parts data into their product catalog.
To prepare the product catalog automation process, we modeled a simple workflow solution in three steps:
- Understand the manufacturing tools data maintained in CAD drawings and 3D images.
- Extract the CAD and 3D image data from the web server's responses in a timely manner.
- Deliver the final output to the users in a required format.
On a daily basis, the customer receives a high volume of spreadsheets containing product and part number data. This data is collected promptly and sent on for downstream processing: the information is looked up on the web pages, extracted, and ingested into the tools data management platform.
These steps were all addressed using the HULFT Integrate REST adapters, which make HTTP calls to a web server. The CAD and 3D image data extracted from successful HTTP responses is then ingested into the required target catalogs through the staging process flow explained in more detail below. All the CAD data is extracted from the drawings and links on these web pages by our REST adapters.
Below is the workflow model developed in HULFT Integrate script:
- Excel Sheet Read
- REST Execute GET
- Read CSV File
- Write CSV File
Set Up and Configuration
1. Read the Global Part numbers from the Excel file with our Excel adapters
To read the contents of the product catalog, define the file name and the Excel sheet name under the Required settings tab of the Excel adapter properties page. The Open Excel Wizard option, highlighted in blue at the top right, opens the Excel sheet to show the file contents.
To define the column list, click Update column list, highlighted in blue at the bottom right corner. This brings in all the column metadata from the Excel file, eliminating the manual step of metadata preparation.
→ Input Data: GPN list.xlsx
2. REST API call: Get a response from the web page with the RESTful API GET operation
Configure the REST adapter: Configuring a REST adapter is a simple two-step process within Integrate. First, define the destination web address and the path of the resources from which data should be extracted. Then set up the Response settings with the required output destination, in either data format or file format.
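For readers who want to see what the GET operation amounts to outside of Integrate, here is a minimal Python sketch. It is not the REST adapter itself. The base address `catalog.example.com` and the helper names `build_product_url` and `fetch_page` are assumptions made for illustration, since the tutorial does not give the real web address.

```python
import urllib.error
import urllib.parse
import urllib.request

# Hypothetical catalog endpoint; the real address is not given in this tutorial.
BASE_URL = "https://catalog.example.com/products"


def build_product_url(base_url: str, part_number: str) -> str:
    """Return the product page URL for one global part number."""
    # Percent-encode the part number so spaces or slashes cannot break the path.
    encoded = urllib.parse.quote(part_number.strip(), safe="")
    return f"{base_url.rstrip('/')}/{encoded}"


def fetch_page(url: str, timeout: float = 30.0) -> tuple[int, bytes]:
    """Send a GET request and return (status code, response body)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status, response.read()
    except urllib.error.HTTPError as err:
        # urllib raises 4xx and 5xx responses as HTTPError; return them as data
        # so the caller can log them instead of crashing.
        return err.code, err.read()
```

Calling `fetch_page(build_product_url(BASE_URL, "GPN-001"))` would return the status code and raw body for one part number, which roughly corresponds to choosing the data format as the output destination.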
3. Map the Image Data attributes to the output to prepare the list of URLs
Map the Image Data attributes to the output to prepare the list of URLs containing the 3D images and CAD drawings. Responses from the REST call are captured to a temporary file with the Write CSV File operation in Integrate.
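The staging idea can be sketched in Python as follows. This is only an illustration of writing part number and URL pairs to a CSV file. The column names `part_number` and `asset_url` and the function name `write_asset_urls` are assumptions, not the layout used by the Write CSV File operation.

```python
import csv
import io


def write_asset_urls(rows, out_file):
    """Write (part_number, asset_url) pairs to an open text file as CSV.

    Returns the number of data rows written.
    """
    writer = csv.writer(out_file)
    # Hypothetical column names; the real staging layout is not documented here.
    writer.writerow(["part_number", "asset_url"])
    count = 0
    for part_number, asset_url in rows:
        writer.writerow([part_number, asset_url])
        count += 1
    return count


# Example: stage one row in memory instead of on disk.
staging = io.StringIO()
write_asset_urls([("GPN-001", "https://catalog.example.com/cad/tool.dxf")], staging)
```

To write a real temporary file, open it with `open(path, "w", newline="")` so the csv module controls the line endings.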
4. Locate the tags in the web page and extract the 3D image and CAD drawing URLs
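As a rough illustration of this step, the Python sketch below scans a page for `href` and `src` attributes and keeps the links that end in one of the extracted file formats (.jpg, .dxf, .stp). The tutorial does not say which tags the customer's pages use, so treating every `href` or `src` attribute as a candidate is an assumption, as are the names `AssetLinkParser` and `extract_asset_urls`.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# File formats named in this tutorial.
ASSET_EXTENSIONS = (".jpg", ".dxf", ".stp")


class AssetLinkParser(HTMLParser):
    """Collect absolute URLs of image and CAD files referenced by a page."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.asset_urls = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name not in ("href", "src") or not value:
                continue
            # Resolve relative links against the page address.
            absolute = urljoin(self.page_url, value)
            # Compare only the path, so query strings do not hide the extension.
            path = urlparse(absolute).path.lower()
            if path.endswith(ASSET_EXTENSIONS) and absolute not in self.asset_urls:
                self.asset_urls.append(absolute)


def extract_asset_urls(html, page_url):
    """Return the asset URLs found in html, in document order, without duplicates."""
    parser = AssetLinkParser(page_url)
    parser.feed(html)
    parser.close()
    return parser.asset_urls
```

Each returned URL can then be fetched with another GET request to download the file it points to.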
5. Extract the CAD drawing URL and browse to the corresponding web page
In this step we repeat the GET operation REST call to search for and download the CAD images and 3D objects located at the product URL.
6. Extract the CAD and 3D Images from the web page
Now we wait for the response to the GET operation sent in step 5 and capture all the resulting 3D images and CAD drawings to the output in the extracted file formats.
7. Validation and Data Ingestion
A mandatory validation step checks whether the images extracted from the product URL by the REST call are valid. If they are not, the REST call returns an error response code, which is captured within Integrate and written to a log file. CAD images are set up with the .dxf extension on their file path.
8. If valid: Ingest the 3D object data to the respective targets
A response code of 4xx or 5xx means an error occurred. All error messages are captured to log files within the Integrate file system, and only the valid objects are downloaded to the required local directory.
File formats extracted: .jpg; .dxf; .stp
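The validation rule described above can be sketched in Python as below. The status code check and the extension list come from this tutorial. The empty-body check, the logger name `catalog_ingest`, and the function name `is_valid_download` are assumptions added for illustration and are not part of the Integrate validation step.

```python
import logging
from pathlib import PurePosixPath
from urllib.parse import urlparse

# File formats named in this tutorial.
VALID_EXTENSIONS = {".jpg", ".dxf", ".stp"}

# Hypothetical logger name; attach a FileHandler to send errors to a log file.
logger = logging.getLogger("catalog_ingest")


def is_valid_download(url, status, body):
    """Return True if the response should be saved, logging the reason if not."""
    # A 4xx or 5xx response code means an error occurred.
    if 400 <= status < 600:
        logger.error("GET %s failed with HTTP %d", url, status)
        return False
    extension = PurePosixPath(urlparse(url).path).suffix.lower()
    if extension not in VALID_EXTENSIONS:
        logger.error("GET %s returned unexpected file type %r", url, extension)
        return False
    # Assumption: an empty body is treated as invalid even with a success code.
    if not body:
        logger.error("GET %s returned an empty body", url)
        return False
    return True
```

Only responses for which this function returns True would be written to the local directory. Everything else ends up in the log.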
9. If Invalid: Error Data Capturing and Notification
Error messages captured to the log files can be set up to trigger notifications to the appropriate owner via email or another communication channel within HULFT Integrate.
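As an illustration of the email option, the Python sketch below builds a notification message from a list of logged error lines. It does not show how Integrate sends notifications. The addresses and the function name `build_error_notification` are hypothetical.

```python
from email.message import EmailMessage

# Hypothetical addresses; replace with the real sender and data owner.
SENDER = "integrate-alerts@example.com"
OWNER = "catalog-owner@example.com"


def build_error_notification(error_lines, sender=SENDER, recipient=OWNER):
    """Build an email that lists the errors captured during extraction."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = f"{len(error_lines)} catalog extraction error(s)"
    # One logged error per line in the message body.
    message.set_content("\n".join(error_lines))
    return message
```

The message could then be sent with `smtplib.SMTP(host).send_message(message)`, where `host` is your mail server.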
Each part number may have multiple 3D images and CAD drawings. These extracted file formats are specific to manufacturing tools and products generated by CAD software. HULFT Integrate extracts the required data from the web using its REST adapters, eliminating manual workflows and building a robust automated process that ingests data into the destination. Here, the most laborious step is identifying and locating millions of drawings and 3D image data. HULFT Integrate not only automates the process but also delivers results in microseconds, even when ingesting millions of objects.