About the company
Our UniQue IT people are the most valuable part of Uni Systems; their knowledge and experience have made us the leading and reliable systems integrator of today and have contributed to our steady financial growth. We have created and maintain a stable working environment for our employees, with countless opportunities to innovate and thrive. Our work culture recognizes our UniQue IT people, supports the free sharing of ideas and the flow of information via open communication, and appreciates and effectively utilizes the talents, skills and perspectives of each employee. At Uni Systems, we provide equal employment opportunities and ban any form of discrimination on grounds of gender, religion, race, color, nationality, disability, social class, political beliefs, age, marital status, sexual orientation or any other characteristics.
Job Summary
What will you be bringing to the team?
• Setting up / improving pipelines to process all required documents that uniquely identify and trace decisions and processing steps. This is to be conducted on the provided classified sandbox environment, with provided performance hardware and toolsets.
• Implementing / improving (missing) pipeline steps for marking duplicate files, based on file attributes, path (structure), and content (similarity), and establishing rules for determining whether a file or structure is a duplicate.
• Extracting document-format records from Functional Area Systems (FAS) databases and other previously performed backups. Archiving SMEs and system SMEs are available to guide target formats and interpret source system structure and data. Each FAS is processed individually; not all sprints address this item.
• Processing / monitoring various office, image, and video file types into accepted archiving formats, including metadata extraction and preparation of semantic indexes for search.
• Automating the registration of all processed documents and their semantic indexes with the sandbox natural language search tool.
• Automating the final transfer of all non-duplicate and extracted archive documents, including content and metadata, to the Institution archiving system.
• Reporting the status, progress, and statistics of raw files being converted into archive formats, along with associated metadata and search indexes.
• Delivering full reporting of results, pipeline step traceability, and documented (stakeholder-approved) exceptions.
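As an illustration of the duplicate-marking step described above, a minimal sketch in Python might group candidate files first by size (cheap attribute check) and then by content hash. The function names (`sha256_of`, `find_duplicates`) and the size-then-hash strategy are assumptions for illustration, not part of the actual pipeline:

```python
import hashlib
from collections import defaultdict
from pathlib import Path

def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so large files are never fully loaded."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def find_duplicates(root: Path) -> dict[str, list[Path]]:
    """Group files under `root` by size first, then by content hash.

    Files with a unique size cannot be exact-content duplicates, so they
    are skipped before the (expensive) hashing pass.
    """
    by_size: dict[int, list[Path]] = defaultdict(list)
    for p in root.rglob("*"):
        if p.is_file():
            by_size[p.stat().st_size].append(p)

    by_hash: dict[str, list[Path]] = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # unique size: cannot be a duplicate
        for p in paths:
            by_hash[sha256_of(p)].append(p)

    # Keep only groups with two or more identical files.
    return {digest: ps for digest, ps in by_hash.items() if len(ps) > 1}
```

A real pipeline step would extend this with the path-structure and content-similarity rules mentioned above (e.g. fuzzy matching for near-duplicates), which exact hashing alone does not cover.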
Requirements
What do you need to succeed in this position?
• Master's degree in Computer Science, Engineering, or a relevant field (an advanced degree in Data Science is preferred)
• At least 3 years of practical experience in the field of data science and/or data analytics
• Experience using data processing, visualization, and analytics software packages and development environments, preferably such as KNIME, VS Code, GitLab, Power BI, Jupyter Lab, and Docker-based APIs
• Experience with Big Data processing, creating and utilizing containerized building blocks, and running containers (APIs) on Kubernetes clusters
• Proficient in programming/scripting languages such as Python, R, and SQL, and working with data formats like CSV, XML, and JSON