SENIOR DATA ENGINEER

6-MONTH CONTRACT

LONDON

£450-£550 PER DAY

This position as a Senior Data Engineer allows you to work within a dynamic finance company located in the heart of London. If you are tired of repetitive projects that tend to merge into BAU, this contract will allow you to work on technically varied projects every day. Developing advanced technical skills will rapidly progress your ability as a Senior Data Engineer and put you ahead of the competition when you look to progress your career.

THE COMPANY:

This company has operated within the global financial sector for over 20 years, and working as a Senior Data Engineer here gives you access to incredible technological resources. You will have the opportunity to apply them to exciting new projects that aim to improve the overall function of the business. The company is regarded as a top employer within the industry, and the experience of working alongside its permanent staff will develop your skills as a Senior Data Engineer and much more.

THE ROLE:

As a Senior Data Engineer, you will have the opportunity to lead the architecture of the data pipeline, mentor Analysts in newer technologies, and act as the primary point of communication between the tech and BI teams.

As a Senior Data Engineer, your objective is to implement real-time analytics and improve data access speeds by increasing the frequency of ETL imports. This will involve:

  • Managing and collating external data from a variety of sources.
  • Implementing new technologies, including SQL and Python, and mentoring Analysts in their use to produce real-time reports.
  • Moving from daily ETL imports to 30-minute batches to reduce the risk of failure and increase uptime (see the sketch after this list).
  • Liaising with external vendors such as Microsoft and AWS to implement spikes.
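
For illustration only, the sketch below shows one way the batch pattern described above might look in Python. None of it comes from the role itself: the extract, transform, and load functions are hypothetical placeholders, and the loop simply shows why smaller, more frequent batches reduce the risk of failure, since a failed 30-minute batch retries from the same point rather than losing a whole day of data.

    import logging
    import time
    from datetime import datetime, timezone

    logging.basicConfig(level=logging.INFO)
    log = logging.getLogger("etl")

    BATCH_INTERVAL_SECONDS = 30 * 60  # 30-minute batches instead of one daily import

    def extract(since: datetime) -> list[dict]:
        """Hypothetical: pull only records that arrived since the last successful batch."""
        return []  # replace with a real source (API, database, flat files)

    def transform(rows: list[dict]) -> list[dict]:
        """Hypothetical: clean and reshape the raw records."""
        return rows

    def load(rows: list[dict]) -> None:
        """Hypothetical: write the batch to the warehouse."""
        log.info("Loaded %d rows", len(rows))

    def run_forever() -> None:
        watermark = datetime.now(timezone.utc)
        while True:
            batch_start = datetime.now(timezone.utc)
            try:
                load(transform(extract(since=watermark)))
                watermark = batch_start  # advance the watermark only on success
            except Exception:
                # A failed 30-minute batch costs at most 30 minutes of freshness;
                # the next cycle retries from the same watermark.
                log.exception("Batch failed; retrying next cycle")
            time.sleep(BATCH_INTERVAL_SECONDS)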

KEY SKILLS AND REQUIREMENTS:

As a Senior Data Engineer, you will require the following background and skills:

  • High-level capability in Python, SQL, Databricks and Google Dataproc.
  • Excellent experience working with unstructured data to create pipelines for analytics teams.
  • A solid understanding of the fundamental stages within the delivery life cycle of BI solutions.
  • Exposure to reporting tools such as Tableau and Power BI.

HOW TO APPLY

Please register your interest by sending your CV to Matt Collett via the apply link on this page.

Ref: 27463/mc

Similar Jobs

Salary: €45,000 - €60,000 per annum
Location: Paris, Île-de-France
Description: In the e-commerce sector, this company, well regarded in the B2B and B2C logistics industry, is looking for a SOFTWARE DEVELOPER.

Salary: £350 - £450 per day
Location: London
Description: This position as a BI Consultant allows you to work within a dynamic consultancy company located in the heart of London.

Salary: €70,000 - €80,000 per annum
Location: Nordrhein-Westfalen
Description: We are looking for a Solution Designer with a focus on BI to join a newly created division in a hugely successful tech and network company in Germany.

Salary: £30,000 - £40,000 per annum + benefits
Location: Leicestershire
Description: Great role for a data-literate Business Analyst with an understanding of SQL and BI tools to join a fast-growing online retailer.

Harnham blog & news

With over 10 years' experience working solely in the Data & Analytics sector, our consultants are able to offer detailed insights into the industry.

Visit our Blogs & News portal or check out our recent posts below.

From Broken Data Pipelines to Broken Data Headlines

This week's guest post is written by Moray Barclay.

Two things have caused the UK's Test & Trace application to lose 16,000 Covid-19 test results, both of which are close to my heart. The first is the application's data pipeline, which is broken. The second is a lack of curiosity. The former does not necessarily mean that a data application will fail. But when compounded by the latter, it is certain.

Data Pipelines

All data applications have several parts, including an interesting part (algorithms, recently in the news), a boring part (data wrangling, never in the news), a creative part (visualisation, often a backdrop to the news), and an enabling part (engineering, usually misunderstood by the news). Data engineering, in addition to the design and implementation of the IT infrastructure common to all software applications, includes the design and implementation of the data pipeline. As its name suggests, a data pipeline is the mechanism by which data is entered at one end of a data application and flows through the application via various algorithms to emerge in a very different form at the other end.

A well-architected data application has a single pipeline from start to finish. This does not mean that there should be no human interaction with the data as it travels down the pipeline, but it should be limited to actions which can do no harm. Human actions which do no harm include: pressing buttons to start running algorithms or other blocks of code, reading and querying data, and exporting data to do manual exploratory or forensic analysis within a data governance framework.

The data pipeline for Test & Trace will look something like this:

  1. A patient manually fills out a web form, which automatically updates a patient list.
  2. For each test, the laboratory adds the test result for that patient.
  3. The lab sends an Excel file to Public Health England with the IDs of positive patients.
  4. PHE manually transpose the data in the Excel file to the NHS Test & Trace system.
  5. The NHS T&T system pushes each positive patient's contact details to NHS T&T agents.
  6. For each positive patient, an NHS T&T contact centre agent phones them.

This is not a single pipeline, because in the middle a human being needs to open up an editable file and transpose it into another file. The pipeline is therefore broken, splitting at the point at which the second Excel file is manually created. If you put yourself in the shoes of the person receiving one of these Excel files, you can probably identify several ways in which this manual manipulation of data could lead to harm.

And it is not just the data which needs to be moved manually from one side of the broken pipeline to the other side; it is the associated data types, and CSV files can easily lose data type information. This matters. You may have experienced importing or exporting data with an application which changes 06/10/20 to 10/06/20. Patient identifiers should be of data type text, even if they consist only of numbers, for future-proofing. Real numbers represented in exponential format should, obviously, be of a numeric data type. And so on.

One final point: the different versions of Excel (between the Pillar 2 laboratories and PHE) are a side-show, because otherwise this implies that had the versions been the same, then everything would be fine. This is wrong. The BBC have today reported that "To handle the problem, PHE is now breaking down the test result data into smaller batches to create a larger number of Excel templates. That should ensure none hit their cap."
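
To make the data-types point concrete, here is a minimal sketch of type-safe CSV reading in Python with pandas. The file name and column names are hypothetical, not from the Test & Trace system; the point is simply that types and date formats are declared explicitly rather than left for the tool to guess.

    import pandas as pd

    # Hypothetical file and column names, for illustration only.
    results = pd.read_csv(
        "pillar2_results.csv",
        dtype={"patient_id": "string"},  # IDs stay text, even if all digits
        parse_dates=["test_date"],
        dayfirst=True,                   # 06/10/20 is 6 October, not 10 June
    )

    # Ensure values written in exponential format (e.g. 1.2E+05) are stored
    # as numbers, not text.
    results["viral_load"] = results["viral_load"].astype("float64")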
This solves the specific Excel incompatibility problem (assuming the process of creating small batches is error-free) but has no bearing on the more fundamental problem of the broken data pipeline, which will remain until the manual Excel manipulation is replaced by a normal and not particularly complex automated process.

Curiosity

So where does curiosity fit in? The first thing that any Data Analyst does when they receive data is to look at it. This is partly a technical activity, but it is also a question of judgement, and it requires an element of curiosity. Does this data look right? What is the range between the earliest and the latest dates? If I graph one measurement over time (in this case positive tests over time), does the line look right? If I graph two variables (such as day of week versus positive tests), what does the scatter chart look like? Better still, if I apply regression analysis to the scatter chart, what is the relationship between the two variables, and within what bounds of confidence? How does that relate to the forecast? Why?

This is not about skills. If I received raw data in CSV format I would open it in a Python environment or an SQL database. But anyone given the freedom to use their curiosity can open a CSV file in Notepad and see that there are actually one million rows of data and not 65,000. Anyone given the freedom to use their curiosity can graph data in Excel to see whether it has strange blips. Anyone given the freedom to use their curiosity can drill down into anomalies. Had those receiving the data from the Pillar 2 laboratories been allowed to focus some of their curiosity on what they were receiving, they would have spotted pretty quickly that the 16,000 patient results were missing. As it was, I suspect they were not given that freedom: I suspect they were told to transpose as much data as they could as quickly as possible, for what could possibly go wrong?

Single Data Pipeline, Singular Curiosity: Pick At Least One

To reiterate, the current problems with T&T would never have arisen with a single data pipeline which excluded any manual manipulation in Excel. But knowing that the data pipeline was broken and that manual manipulation was by design part of the solution, the only way to minimise the risk was to encourage the people engaged in that manual process to engage their curiosity about the efficacy of the data they were manipulating. In their prototype phases, for that is the status of the T&T application, data projects will sometimes go wrong. But they are much more likely to go wrong if the people involved, at all levels, do not have enough time or freedom to think, to engage their curiosity, and to ask themselves "is this definitely right?"

You can view Moray's original article here. Moray Barclay is an experienced Data Analyst working in hands-on coding, Big Data analytics, cloud computing and consulting.
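
As a minimal illustration of this kind of first look, the sketch below counts the rows in a CSV file and prints its date range before any processing happens. The file and column names are hypothetical; a check this small would surface a missing 16,000 rows immediately.

    import csv
    from datetime import datetime

    # Hypothetical file and column names, for illustration only.
    with open("pillar2_results.csv", newline="") as f:
        reader = csv.DictReader(f)
        dates = [datetime.strptime(row["test_date"], "%d/%m/%y") for row in reader]

    print(f"Rows received: {len(dates)}")           # does this match what was expected?
    print(f"Earliest test: {min(dates):%d %b %Y}")  # does the date range look right?
    print(f"Latest test:   {max(dates):%d %b %Y}")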

2020: The Year of the Data Engineer

Data Engineers are the architects of Data. They lay the foundation businesses use to collect, gather, store, and make Data usable. Each iteration of the Data, as it moves along the pipeline, is cleaned and analysed so that it can be used by Data professionals for their reports and Machine Learning models.

A ROLE IN HIGH DEMAND

Even as businesses reopen, reassess, and, for some, remain remote, the demand for Data Engineers is high. Computer applications, Data modelling, prediction modelling, Machine Learning, and more need Data professionals to lay the groundwork that helps businesses benefit in today's Data-driven culture. The word gets thrown around a bit, but when the majority of business has moved online, Data-driven is the name of the game. Having a Data plan and a Data team, both aligned with your business strategy, is imperative to the way business is done today. This type of innovation can offer insight for better business decisions, enhance customer engagement, and improve customer retention without missing a beat.

Without Data Engineers, Data Scientists can't do their jobs. Handling the volume of Data, the speed at which it is delivered, and its variety requires Engineers who can create reliable and efficient systems. Like many Data professionals, even in 2020, Data Engineers are in high demand. Yet a skills shortage remains. This has created an emerging field of professionals from other backgrounds who are looking to take on the role of Data Engineer and fill the gap. Whether by necessity or design, these individuals build and manage pipelines, automate projects, and see their projects through to the end result.

CAREER OPPORTUNITIES OUTSIDE THE NORM

As this trend grows, it has created career opportunities for those with experience outside the normal channels of Data Engineering study. While it often involves individuals from backgrounds such as software engineering, databases, or something similarly IT-related, some businesses are also upskilling their existing employees into the role. Rapid growth, reskilling, upskilling, and ever-constant change still leave businesses with a shortage of Data Engineers to meet the demand, and filling that gap is critical to success. According to LinkedIn's 2020 Emerging Jobs Report, Data Engineering is listed in the top 10 jobs experiencing growth.

THREE STEPS TOWARDS BECOMING A DATA ENGINEER

This is a vital role in today's organisations. So, if you're in the tech industry and want to take a deeper dive into Data as a Data Engineer, what steps can you take? This is a time like no other: there's time to assess your goals, take online classes, and get hands-on with projects. A degree in computer science, mathematics, or a business-related field is always a good start. From there:

  1. Be well-versed in popular programming languages and technologies such as SQL, Python, R, Hadoop, Spark, and Amazon Web Services (AWS).
  2. Prepare for an entry-level role once you have your bachelor's degree.
  3. Consider additional education to stay ahead of the curve. This can include not only professional certifications but higher education degrees as well.

The more experience you have, hands-on as well as academic, the more in demand you'll be as a Data Engineer. Data Scientists might be the rockstars of Data, but Data Engineers set the stage.

As business processes have shifted online, looking for your next job has become more daunting than ever before. If you're looking for your next opportunity in Data, take a look at our current jobs or get in touch with one of our expert consultants to find out more.
