Data Engineer Manager
City of London, London / £90000 - £100000
SENIOR DATA ENGINEER
UP TO £85,000 + BENEFITS
LONDON, CURRENTLY FLEXIBLE / REMOTE / HOMEWORKING
Harnham is partnered with a global travel company that uses real-time data to bring together relevant information for travellers and make their journeys as easy as possible. This role is 50% hands-on and 50% management, giving you the opportunity to work on technical projects alongside mentoring and working with stakeholders.
This is a very exciting time to join the company. It is the leading travel platform, currently used in several countries by over 100 million customers, and it shows no sign of slowing down. The company wants to use cutting-edge technology to grow the business further and give its customers the most up-to-date, real-time travel information.
- Managing the infrastructure and architecture of the team.
- Ensuring all data pipelines run reliably so the company can understand its customers.
- Building on the company's current platform, which uses real-time data through Kafka to provide customers with real-time travel information.
- Leading on project/product implementations within the cloud.
YOUR SKILLS & EXPERIENCES
- Highly proficient in Python and SQL programming.
- Previous experience working in AWS.
- An understanding of Spark/PySpark.
- Good knowledge of functional programming paradigms.
HOW TO APPLY:
Please register your interest by sending your CV to Joy Bruty via the apply link on this page. (The company has outlined a fully remote interview process and has a remote onboarding policy in place.)
Data Engineer Or Software Engineer: What Does Your Business Need? | Harnham US Recruitment post
We are in a time in which what we do with Data matters. Over the last few years, we have seen a rapid rise in the number of Data Scientists and Machine Learning Engineers as businesses look to find deeper insights and improve their strategies. But without proper access to the right Data, processed and massaged, Data Scientists and Machine Learning Engineers would be unable to do their jobs properly. So who are the people who work in the background and are responsible for making sure all of this works? The quick answer is Data Engineers!… or is it? In reality, there are two similar, yet different, profiles who can help a company achieve its Data-driven goals.
Data Engineers
When people think of Data Engineers, they think of people who make Data more accessible to others within an organization. Their responsibility is to make sure the end user of the Data, whether an Analyst, a Data Scientist, or an executive, can get accurate Data from which the business can make insightful decisions. They are experts when it comes to Data Modeling, often working with SQL. Frequently, “modern” Data Engineers work with a number of tools including Spark, Kafka, and AWS (or any cloud provider), whilst some newer Databases/Data Warehouses include MongoDB and Snowflake. Companies are choosing to leverage these technologies and update their stack because doing so allows Data teams to move at a much faster pace and deliver results to their stakeholders.
An enterprise looking for a Data Engineer will need someone to focus more on its Data Warehouse and apply strong querying skills, whilst constantly working to ingest and process Data. Data Engineers also focus more on Data Flow and on knowing how each Data set works in collaboration with the others.
Software Engineers – Data
Similar to Data Engineers, Software Engineers – Data (whom I will refer to as Software Data Engineers in this article) also build out Data Pipelines.
These individuals might go by different names, like Platform or Infrastructure Engineer. They have to be good with SQL and Data Modeling, working with similar technologies such as Spark, AWS, and Hadoop. What separates Software Data Engineers from Data Engineers is the need to look at things from a macro level. They are responsible for building out the cluster manager and scheduler, the distributed cluster system, and implementing code to make things function faster and more efficiently.
Software Data Engineers also tend to be stronger programmers. Frequently, they will work in Python, Java, Scala, and, more recently, Golang. They also work with DevOps tools such as Docker, Kubernetes, or a CI/CD tool like Jenkins. These skills are critical, as Software Data Engineers are constantly testing and deploying new services to make systems more efficient.
This is important to understand, especially when incorporating Data Science and Machine Learning teams. If Data Scientists or Machine Learning Engineers do not have strong Software Engineers in place to build their platforms, the models they build won’t reach their full potential. Software Data Engineers also have to be able to scale out systems as the platform grows in order to handle more Data, while finding ways to make improvements. They will also look to work with Data Scientists and Machine Learning Engineers to understand what is needed to support a Machine Learning model.
Which is right for your business?
If you are looking for someone who can focus extensively on pulling Data from a Data source or API, transforming or “massaging” the Data, and then moving it elsewhere, then you are looking for a Data Engineer. Quality Data Engineers will be really good at querying Data and Data Modeling, and will also be good at working with Data Warehouses and using visualization tools like Tableau or Looker.
If you need someone who can wear multiple hats and build highly scalable and distributed systems, you are looking for a Software Data Engineer. It’s more common to see this role in smaller companies and teams, since Hiring Managers often need someone who can do multiple tasks due to budget constraints and the need for a leaner team. Software Data Engineers will also be better coders and have some experience working with DevOps tools. Although they might be able to do more than a Data Engineer, they may not be as strong when it comes to the nitty-gritty parts of Data Engineering, in particular querying Data and working within a Data Warehouse.
It is always a challenge knowing which type of job to recruit for. It is not uncommon to see job posts where companies advertise for a Data Engineer but in reality are looking for a Software Data Engineer or Machine Learning Platform Engineer. In order to bring the right candidates to your door, it is crucial to understand which responsibilities you are looking to have fulfilled.
That’s not to say a Data Engineer can’t work with Docker or Kubernetes. Engineers are working in a time where they need to become proficient with multiple tools and constantly hone their skills to keep up with the competition. However, it is this demand to keep up with the latest tech trends and choices that makes finding the right candidate difficult. Hiring Managers need to identify from the start which skills are essential for the role and which can be easily picked up on the job. Hiring teams should focus on an individual’s past experience and the projects they have worked on, rather than on their previous job titles.
If you’re looking to hire a Data Engineer or a Software Data Engineer, or to find a new role in this area, we may be able to help. Take a look at our latest opportunities or get in touch if you have any questions.
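To make the Data Engineer workflow described in this article more concrete (pulling Data from a source or API, transforming or “massaging” it, then moving it elsewhere), here is a minimal Python sketch. Everything in it is an assumption made for illustration: the payload, the field names, the trips table, and the use of an in-memory SQLite database as the destination. It is not a description of any particular company's stack.

```python
import json
import sqlite3

# Hypothetical API response, inlined so the sketch runs without a network call.
RAW_PAYLOAD = json.dumps([
    {"id": 1, "city": " london ", "fare": "12.50"},
    {"id": 2, "city": "PARIS", "fare": "8"},
    {"id": 3, "city": "", "fare": "n/a"},
])


def extract(payload):
    """Extract: parse the raw JSON text into a list of dicts."""
    return json.loads(payload)


def transform(rows):
    """Transform: tidy city names, parse fares, and drop unusable rows."""
    cleaned = []
    for row in rows:
        city = row["city"].strip().title()
        try:
            fare = float(row["fare"])
        except ValueError:
            continue  # skip rows whose fare is not numeric
        if not city:
            continue  # skip rows with no city
        cleaned.append((row["id"], city, fare))
    return cleaned


def load(rows, conn):
    """Load: write the cleaned rows into a (hypothetical) trips table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS trips "
        "(id INTEGER PRIMARY KEY, city TEXT, fare REAL)"
    )
    conn.executemany("INSERT INTO trips VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_PAYLOAD)), conn)
print(conn.execute("SELECT city, fare FROM trips ORDER BY id").fetchall())
```

Keeping extract, transform, and load as separate functions means each stage can be tested on its own. In a real pipeline the in-memory database would likely be replaced by a Data Warehouse and the inlined payload by a call to the actual source.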
The Six Steps Of Data Governance | Harnham Recruitment post
The value that data analysis can provide to organisations is becoming increasingly clear. But with all the buzz around the endless ways that data can be used to revolutionise your business processes, it can be overwhelming to know where to start.
Fundamentally, what you can do with your data and how useful it may be will hinge on its quality. This is the case no matter what data you may have, whether that be customer demographics or manufacturing inventories. High-quality data is also imperative for utilising exciting and innovative new technology such as Machine Learning and AI. It’s all very well investing in tech to harness your data assets to, for example, better inform decision making, but you won’t be able to glean any useful analysis if the data is full of gaps and inconsistencies.
Many will be looking at this new tech and be tempted to run before they can walk. But building quality data sets and water-tight, long-lasting processes will form the foundation for any future developments and should not be overlooked. This is where Data Governance comes into its own.
Data Governance (DG) is an effective step in improving your data and turning it into an invaluable asset. It has numerous definitions, but according to the Data Governance Institute (DGI), “Data Governance is the exercise of decision-making and authority for data-related matters.”
Essentially, DG is the process of managing data during its life cycle. It ensures the availability, usability, integrity and security of your data, based on internal data standards and policies that control data usage. Good data governance is critical to success and is becoming increasingly so as organisations face new data privacy regulations and rely on data analytics to help optimise operations and drive business decision-making. As Ted Friedman from Gartner said: ‘Data is useful.
High-quality, well-understood, auditable data is priceless.’
Without DG, data inconsistencies in different systems across an organisation might not get resolved. This could complicate data integration efforts and create data integrity issues that affect the accuracy of business intelligence (BI) reporting and analytics applications.
Data Governance programs can differ significantly depending on their focus, but they tend to follow a similar framework:
Step 1: Define goals and understand the benefits
The first step of developing a strategy should be to ensure that you have a comprehensive understanding of the process and what you would like the outcome to be.
A strong Data Governance strategy relies on ‘buy-in’ from everyone in the business. By stressing the importance of complying with the guidelines which you will later set, you will be helping to encourage broad participation and ensure that there is a concerted and collaborative effort to maintain high standards of data quality. Leaders must be able to comprehend the benefits themselves before communicating them to their team, so it may be worth investing in training around the topic.
Step 2: Analyse and assess the current data
The next step is essentially sizing up the job at hand, to see where improvements might need to be made. Data should be assessed against multiple dimensions, such as the accuracy of key attributes, the completeness of all required attributes and the timeliness of data. It may also be valuable to spend time analysing the root causes of inferior data quality.
Sources of poor data quality can be broadly categorised into data entry, data processing, data integration, data conversion, and stale data (over time), but there may be other elements at play to be aware of.
Step 3: Set out a roadmap
Your data governance strategy will need a structure in which to function, which will also be key to measuring the progress and success of the program.
Set clear, measurable, and specific goals; as the saying goes, you cannot control what you cannot measure. Plans should include timeframes, resources and any costs involved, as well as identifying the owners or custodians of data assets, the governance team, steering committee, and data stewards who will all be responsible for different elements. Including business leaders or owners in this step will ensure that programs remain business-centric.
Step 4: Develop and plan the data governance program
Building around the timeline outlined, you can then drill down to the nitty-gritty. DG programs vary but usually include:
- Data mapping and classification: sorting data into systems and classifying it based on criteria.
- Business glossary: establishing a common set of definitions of business terms and concepts, helping to build a common vocabulary to ensure consistency.
- Data catalogue: collecting metadata and using it to create an indexed inventory of available data assets.
- Standardisation: developing policies, data standards and rules for data use to regulate procedures.
Step 5: Implement the data governance program
Communicating the plan to your team may not be a one-step process and may require a long-term training schedule and regular check-ins. The important thing to realise is that DG is not a quick fix; it will take time to be implemented and fully embraced. It may also need tweaks as it goes along and as business objectives change. All DG strategies should start small and slowly build up over time. Rome wasn’t built in a day, after all.
Step 6: Close the loop
Arguably the most important part of the process is being able to track your progress and check in at periodic intervals to ensure that the data is consistent with the business goals and meets the data rules specified.
Communicating the status to all stakeholders regularly will also help to ensure that a data quality discipline is maintained throughout.
Looking for your next big role in Data & Analytics or need to source exceptional talent? Take a look at our latest Data Governance jobs or get in touch with one of our expert consultants to find out more.
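As a rough illustration of the Step 2 assessment described in this article, the Python sketch below scores a small data set on two of the dimensions mentioned: completeness of required attributes and timeliness. The sample records, the field names, and the 365-day freshness threshold are all hypothetical choices made for the example; a real program would set them according to its own data standards.

```python
from datetime import date

# Hypothetical sample records; the field names are illustrative only.
RECORDS = [
    {"customer_id": 1, "email": "a@example.com", "updated": date(2024, 1, 10)},
    {"customer_id": 2, "email": None, "updated": date(2023, 1, 5)},
    {"customer_id": 3, "email": "c@example.com", "updated": date(2024, 1, 2)},
    {"customer_id": 4, "email": "", "updated": date(2022, 6, 30)},
]

# Assumed list of attributes that every record must populate.
REQUIRED_FIELDS = ("customer_id", "email", "updated")


def completeness(records, required=REQUIRED_FIELDS):
    """Share of records in which every required attribute is populated."""
    if not records:
        return 0.0
    complete = sum(
        all(r.get(field) not in (None, "") for field in required)
        for r in records
    )
    return complete / len(records)


def timeliness(records, as_of, max_age_days=365):
    """Share of records updated within max_age_days of the as_of date."""
    if not records:
        return 0.0
    fresh = sum((as_of - r["updated"]).days <= max_age_days for r in records)
    return fresh / len(records)


report = {
    "completeness": completeness(RECORDS),
    "timeliness": timeliness(RECORDS, as_of=date(2024, 2, 1)),
}
print(report)
```

Scores like these give a baseline that can be re-measured at the periodic check-ins described in Step 6. Accuracy, the third dimension mentioned in Step 2, is left out because it needs a trusted reference source to compare against.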
Where Does the Data Engineer Sit in the Data Chain? | Harnham US Recruitment post
Data Scientist. Data Architect. Data Engineer. With so many professional titles in the Data and Technology space, it can be difficult to distinguish one from another. You may have an interest in Data but not be sure which field you’d like to move into, and as things become more specialized, it adds another layer of education and experience required to make the move.
Every one of the titles above has a place and a responsibility along the Data chain, but some may be better known than others. In order to wrangle Data, clean and analyze it, or develop programming from it, you need someone to build and maintain the pipeline systems that give Data Scientists a map to follow when collecting, cleaning, and analyzing the data.
Though not interchangeable, the Data Scientist and the Data Engineer work together as two halves of a whole on the Data team. One role crafts the roadmap or blueprint for others to follow, while the other gathers insights from the data based on specific datasets requested and designed by the Data Engineer.
So, let’s look first at what a Data Engineer does and the skillsets required for the role.
WHAT IS A DATA ENGINEER?
A Data Engineer takes the blueprints from the Data Architect and creates the pipelines. It sounds simple. But it isn’t. A Data pipeline is just what it sounds like: the process Data goes through from inception to implementation, together with the technologies and frameworks involved, which can number up to 30. So, a Data Engineer is responsible for developing, testing, and maintaining the data pipeline. That’s a lot of wrangling, cleaning, and prepping to ensure reliable information is filtered to the Data Scientist.
3 TYPES OF DATA ENGINEER ROLES
1. Generalist – This role is often found on small teams. Though the Generalist may understand processes but not necessarily systems, it’s a good place to begin if you’re a Data Scientist interested in stepping into a Data Engineer role.
The focus here is on end-to-end collection and processing of Data.
2. Pipeline – You’ll find this role tackling more complicated projects on midsize Analytics teams. The Pipeline-focused Data Engineer is found in medium to larger-size businesses.
3. Database – The Database-focused engineer is found most often in larger businesses with distributed systems across several databases. These individuals are responsible for implementing what the Data Architect has created, and collecting the information to inform analytics databases.
7 SKILLS REQUIRED FOR DATA ENGINEER
Data Engineers are the ones who keep everything running smoothly. Even if a technology doesn’t necessarily fall within their scope of responsibilities, they should still understand it and be able to prepare Data for it. This is particularly the case when it comes to Machine Learning. Though it’s more aligned with the Data Scientist, a Data Engineer should know enough about it to craft algorithms and gather insights.
Below are a few more technical skills a Data Engineer should have to be successful in their role.
1. Know and understand the right tools for the job
2. Technical Skills include:
3. Linux
4. SQL
5. Python
6. Kafka, Flink, and Kudu as processing frameworks and storage engines, and knowing which tool is best for which task.
7. General understanding of distributed systems and how they’re different from traditional systems.
The role of the Data Engineer is unique in that how this person thinks depends on what needs to be done. In some cases, you’ll need to think like an engineer, and in other cases, you’ll need to think like a product manager. This is one of the reasons it’s important to have such deep knowledge of systems and processes, and to know the right tool and the right person for the job.
If you’re looking for your next role in Big Data, Analytics, Software Engineering, or Computer Vision, Harnham may have a role for you.
Check out our current vacancies or contact one of our expert consultants to learn more.
For our West Coast Team, contact us at (415) 614-4999 or email email@example.com.
For our Mid-West and East Coast teams, contact us at (212) 796-6070 or email firstname.lastname@example.org.
CAN’T FIND THE RIGHT OPPORTUNITY?
If you can’t see what you’re looking for right now, send us your CV anyway – we’re always getting fresh new roles through the door.