<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: David Ayres</title>
    <description>The latest articles on DEV Community by David Ayres (@davidayres).</description>
    <link>https://dev.to/davidayres</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1816045%2Fd966df48-0490-4185-9b07-30525c2f7940.jpg</url>
      <title>DEV Community: David Ayres</title>
      <link>https://dev.to/davidayres</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/davidayres"/>
    <language>en</language>
    <item>
      <title>What is Architecture to me?</title>
      <dc:creator>David Ayres</dc:creator>
      <pubDate>Fri, 13 Sep 2024 09:03:50 +0000</pubDate>
      <link>https://dev.to/davidayres/what-is-architecture-to-me-579a</link>
      <guid>https://dev.to/davidayres/what-is-architecture-to-me-579a</guid>
      <description>&lt;p&gt;So for my next article in this series, there was a comment I left hanging in my last one: &lt;/p&gt;

&lt;p&gt;"Yes there are lots and lots of flavours of this; Business, Enterprise, Security, Infrastructure, Application, Principal, Solution and a whole host of others that differ from company to company."&lt;/p&gt;

&lt;p&gt;Which definitely needs to be explored. Officially speaking I'm a "Solution" Architect. Which can pretty much be any/all of the flavours of Architect above, like a &lt;em&gt;smorgasbord&lt;/em&gt; of Architecture responsibility.&lt;/p&gt;

&lt;p&gt;So what is Architecture to me and what does my day to day look like?&lt;/p&gt;

&lt;h1&gt;
  
  
  Solutions Architecture
&lt;/h1&gt;

&lt;p&gt;So, a Solution Architect..... tends to have 1 or more Systems and is responsible for the technical ownership of them. They'll draw some system designs - "boxes and arrows" at what's often called a High Level. This box talks to this box that then talks to this box. All technical detail is abstracted away and it's the simplest view of the system possible, including internal systems and external third party ones (think Salesforce, Workday, etc). It'll show what the basic flow of data looks like and how it crosses other Systems and domains.&lt;/p&gt;

&lt;p&gt;They should (I say should as it's different everywhere and for everyone) also have supporting documentation to describe the box (or boxes) they own. I'll do another article on the story of a solution design because there's plenty to talk about "how" solutions should be described.&lt;/p&gt;

&lt;p&gt;Those 2 documents together are key. We are responsible for making sure there's adequate documentation describing what a solution does, how it does it and why it's doing it. That way, anybody who comes along and wants to learn about it can read the documents first, without needing to rely on an individual sharing their knowledge. It's critical to abstract a solution away from a person. People move on, leave companies and forget things, whereas a documented historical artefact of that solution lives forever. Systems are rarely short term and this has to be considered when writing documentation.&lt;/p&gt;

&lt;p&gt;So a Solution Architect needs to be competent in the written word.&lt;/p&gt;

&lt;p&gt;A Solution is also a fairly public component of a company. You might be lucky enough to design something isolated, hidden, that people don't really need to know about. If not, then you'll almost certainly need a third supporting document for your solution. The dreaded presentation. That "PowerPoint" you'll have to run through time and time again, to various members of the business, from Director level downwards. Maybe you are justifying the expense of a project, or sharing its success. Either way, an Architect has to be able to not only create an engaging and interesting presentation but also deliver that message in a language the audience will understand. Presentation skills should not be underestimated. For me, it's something I do weekly. I'm lucky that I like the sound of my own voice but really, it's because I'm always invested in what I'm working on and always willing to discuss it.&lt;/p&gt;

&lt;p&gt;Then another key part of my role is the talking. There's always lots of talking. To help formulate a design;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;There are conversations with the business to understand requirements. These will take place with a Business Analyst to understand "what" is needed. It's not a technical conversation but understanding what problems we need to solve forms the foundation of the Solution. If you can't solve the business problem, you'll never "win".&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Any other Systems I might need to interact with, I'll talk with colleagues in those business/tech domains to understand how we can integrate and share the data. It could be an Architect, a Platform Lead, a Lead Engineer - whoever has that Technical ownership.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;There's going to be some sort of a Technical governance group, where the Solution needs to be presented and "approved". They'll give insight from their own experience, ask probing questions to see if I have gaps.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;If a design moves towards a Buy decision (more on that in another article) then there will be a super exciting RFP process talking to potential Suppliers and evaluating their offering. Is this fit for purpose, will it meet any NFRs, how locked in might we be with this choice etc.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Finally, there's the company governance. The security team audits and checks. Is GDPR data protected? Have I designed a hole that could be exploited by outside malicious parties? In the past these have been Threat Modelling sessions reviewed by a Security Architect but each company has its own requirements for this.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;At the point a design is "complete" the knowledge share is between the Engineers, Architect and Analyst. It's your classic pizza sized team. To get to that point, there will have been so many touch points for the Architect and Analyst that get abstracted away to make the Engineers' lives easier. You need to be able to talk to people and build up rapport with a whole host of different job roles and personalities. There's a huge social touchpoint graph of conversations to complete a Solution.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Technical "Stuff"
&lt;/h2&gt;

&lt;p&gt;We've got a High Level diagram but the next level down starts to become more technical. That's where you start seeing an API and a data store being added to a diagram. We still aren't at a super technical level here, we aren't talking code, contracts or specific technologies, it's still very abstract, still boxes and arrows but something looking more like a technical delivery and something relevant to Engineers.&lt;/p&gt;

&lt;p&gt;This enables conversations around integration patterns and how we might share data upstream or downstream. If I know a third party solution is going to be used (more on that in another article) they'll get added to the picture along with their integration components so we can map out the journey of data to more detail than the HLD. The same goes for any internal systems that might need to be integrated too.&lt;/p&gt;

&lt;p&gt;My rule of thumb, rightly or wrongly, is that anything that starts becoming specific to a technology - like an Azure function app, a SQL database or a Service Bus Technology (RabbitMQ) - is too much detail for this level. As a reminder, I'm working as a Solution Architect. If you are lucky enough to be a Software/Application Architect you might find yourself being required to do that Technical component diagram and prescribe to Engineers exactly what needs to be built.&lt;/p&gt;

&lt;p&gt;Once that's finished, it's time to engage with an Engineering team. Together with a Lead/Principal/Staff/Senior Engineer(s) the deep down technical design can take place. I'll input my Architecture diagrams, along with non functional requirements around load, volume, data size, performance etc. and together we assess the correct technical components to fulfil the requirements.&lt;/p&gt;

&lt;p&gt;Key decisions tend to be;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Relational Database vs Document Database&lt;/li&gt;
&lt;li&gt;API vs Service Bus&lt;/li&gt;
&lt;li&gt;What parts of the system need to scale and what limits do we need to put in place&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I'll lead the design and make sure the technical diagram is completed, although it's a collaborative effort to put it together I have ownership of the artefact to add to my collection.&lt;/p&gt;

&lt;p&gt;Now, something that's pretty important to understand here is that, yes, in isolation I "could" do the technical design myself but if I'm not building it and supporting it, I might make a technology decision that doesn't mesh with the team. It's only "fair" that they get a considerable say in what this looks like under the hood. They'll be the ones supporting it 24/7 while I swan off to the next project.&lt;/p&gt;

&lt;p&gt;Equally, I shouldn't be an end to end detailed technical expert in all the technologies my company has adopted. I need to know enough to hold a conversation about the pros and cons of a technology, be able to assess it against NFRs but when it comes to the inner workings, it's left to the experts.&lt;/p&gt;

&lt;p&gt;An Architect needs wide knowledge across all technologies a company works with, along with many other emerging ones to assess/recommend adoption but not the deep down detailed implementation. It means keeping an eye on the market, reading, following industry experts and trying to stay on top of things. Ultimately though, it's boxes and arrows that are key for me, where this Solution sits within the business and how it's integrated.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Project Team
&lt;/h2&gt;

&lt;p&gt;My next comments are very specific to the way I like to work and definitely don't gel with how everybody likes to work. When it comes to Architecture you either dictate a Solution and move on, or you stay with it and follow it through to deployment.&lt;/p&gt;

&lt;p&gt;For me, I love being part of project teams and I'm a strong advocate of Agile Architecture. I'll try my best (project capacity dependent) to attend all the agile ceremonies for a team and I've even been known to have my own Architecture stories as part of sprint deliveries. I want to be there, in the thick of it, supporting the team.&lt;/p&gt;

&lt;p&gt;At the end of the day, alongside the Business Analyst, we are the subject matter experts for the Solution, the business value, the "why" something is being built so if we are working embedded within the team any questions or issues that arise can be resolved immediately. This also allows for any tweaks or changes to the design to be addressed as quickly as possible. I'm there to make sure the solution matches the design and that the team deliver to my vision.&lt;/p&gt;

&lt;h1&gt;
  
  
  Conclusion
&lt;/h1&gt;

&lt;p&gt;It's getting a bit long now and the aim for all of this was to try and keep it short/snappy. Hopefully it gives you some insight into the value that an Architect brings but to summarise;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;We need to be experts in our specific business domain but knowledgeable in other areas (that classic T individual)&lt;/li&gt;
&lt;li&gt;We draw lots of boxes and arrows on pages, a diagram speaks a thousand words&lt;/li&gt;
&lt;li&gt;We need to be able to write clear, concise and engaging documentation&lt;/li&gt;
&lt;li&gt;We talk to &lt;em&gt;A LOT&lt;/em&gt; of people across the business and third parties, communication is key&lt;/li&gt;
&lt;li&gt;We need to be able to present our Solution to all levels of a business&lt;/li&gt;
&lt;li&gt;There's a whole Technical knowledge and understanding that comes with it, we need to be continuous learners&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To round out my list of articles - I'll cover off;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The narrative and artefacts of a Solution&lt;/li&gt;
&lt;li&gt;The Build vs Buy decision&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Which will hopefully give people a rounded view of the value Architecture brings to a business and what Architecture means to me.&lt;/p&gt;

&lt;p&gt;Thanks for your time!&lt;/p&gt;

</description>
      <category>architecture</category>
      <category>solutiondesign</category>
      <category>career</category>
      <category>web</category>
    </item>
    <item>
      <title>Are You Secretly an Architect?</title>
      <dc:creator>David Ayres</dc:creator>
      <pubDate>Tue, 06 Aug 2024 18:20:03 +0000</pubDate>
      <link>https://dev.to/davidayres/are-you-secretly-an-architect-phm</link>
      <guid>https://dev.to/davidayres/are-you-secretly-an-architect-phm</guid>
      <description>&lt;h1&gt;
  
  
  Intro
&lt;/h1&gt;

&lt;p&gt;Although my first article was all about the magic of CSV Schema Validation (check it out if you haven't already!) the majority of my day to day fits firmly into the Solution Architecture remit. I'm sure some code heavy posts will get written (I still have my hobby code, don't tell my boss), consider this Part 1 in a series of very Architecture focused articles where I'm hoping to;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Give some insight into how I've fallen into the role I'm in.&lt;/li&gt;
&lt;li&gt;What it means to me being an Architect.&lt;/li&gt;
&lt;li&gt;The average (or not so average) day to day life of a Solution Architect.&lt;/li&gt;
&lt;li&gt;Insight, advice and a Sales Pitch for those Engineers who are considering making the move across.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Then if people find those interesting I'll also drop in some more detailed articles on more focused and role specific topics, especially the dreaded Build vs Buy debate that fills most of my days (I enjoy it really) and did somebody mention non-functional requirements!?&lt;/p&gt;

&lt;h1&gt;
  
  
  The Architect
&lt;/h1&gt;

&lt;p&gt;Of all the roles within the Tech Community, none come with as much inconsistency and widely differing job roles as that of the Technical Architect.&lt;/p&gt;

&lt;p&gt;Yes there are lots and lots of flavours of this; Business, Enterprise, Security, Infrastructure, Application, Principal, Solution and a whole host of others that differ from company to company.&lt;/p&gt;

&lt;p&gt;You'll almost certainly have experienced working with at least one of these mysterious Architects, who often drift into and out of projects, seemingly preaching down from an Ivory Tower on how things should be done. &lt;/p&gt;

&lt;p&gt;Then when they aren't doing that they'll be sighted having hushed conversations with all the "C-Suite" members about top secret initiatives that might be shared months/years later.&lt;/p&gt;

&lt;p&gt;That's certainly the overall consensus of Architecture, one I can be guilty of myself sometimes unfortunately. Some of us don't mean to do it, I promise.....&lt;/p&gt;

&lt;h1&gt;
  
  
  My Journey
&lt;/h1&gt;

&lt;p&gt;So I've pretty much stumbled through my career. I did an A-Level in computing because I was sort of good at it and didn't mind it. That then meant going to University and doing Computer Science because again, I wasn't sure what I wanted to do and it was sort of enjoyable and I was sort of good at it. I was then ejected from University with no real direction; I wasn't prepared for the world of work, either mentally or through my education. I ended up applying for anything computer related that I could but found plenty of rejection because of my lack of experience which, even to this day, is still the case for many of you.&lt;/p&gt;

&lt;h2&gt;
  
  
  IT Support
&lt;/h2&gt;

&lt;p&gt;Eventually I managed to land a role at a large IT company close to me doing first line IT support in a call centre. A stepping stone to hopefully work my way up through the organisation. As tedious as that role was, I've always spoken positively of the experience and it's helped shape a lot of the core skills I still use to this day.&lt;/p&gt;

&lt;p&gt;I would be talking to hundreds of people a day, so it forced me to be personable and taught me how to build rapport with people. Sometimes people would be upset/angry they were having an IT issue and as the "face of the company" I had to try and win them over so we could try and resolve their issue. Typically not the sort of exposure an Engineer has.....&lt;/p&gt;

&lt;p&gt;Coupled with that, I had to try and drill down to the underlying problem a customer was having as quickly as possible. I had to learn techniques and the right questions to ask. We were motivated to fix as many issues as possible with the customer on the phone so it became a bit of an art form.&lt;/p&gt;

&lt;h2&gt;
  
  
  Web Development
&lt;/h2&gt;

&lt;p&gt;I've not been entirely truthful about my experience. While at University I did find myself drawn to Web Development. So while doing my day job, I was honing my skills and dabbling in what I could achieve website wise. I was fortunate enough to get a few private jobs through friends/family and managed to build a bit of a portfolio. Then my stubbornness paid off and my company advertised a role for a trainee web developer which I applied for and got. So I got to write code, learn from peers/senior developers and work on some pretty big sites. What was unique about this role was that I interacted directly with clients. There wasn't a Project Manager or Business Analyst (which to this day still confuses me); instead it was myself and the Team Lead going out to clients, discussing designs, requirements and doing the up sell of what we could deliver for them. I still use the phrase "walking the floor" from that role - which was coined for when the Team Lead and I would visit clients and tour their offices, making contacts and trying to drum up new business. A pretty unique environment but another one that helped shape my core skills. We would solutionise on the fly, so I got very comfortable selling projects to customers.&lt;/p&gt;

&lt;p&gt;I moved through a few other companies, tried my hand at agency work but there was something consistent throughout those roles;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;I was always customer facing, selling projects/solutions and learning how to describe something technical in a language my audience would understand.&lt;/li&gt;
&lt;li&gt;I was designing the projects I worked on and naturally leading teams in how we should deliver them.&lt;/li&gt;
&lt;li&gt;Everything would be fast paced, time was money and so decisions needed to be quick and correct first time.&lt;/li&gt;
&lt;li&gt;I found myself writing less code and more taking ownership of what was being delivered/how.&lt;/li&gt;
&lt;li&gt;Project Managers and Business Analysts came into the picture as the industry matured (now I sound old) but I would still be out there with sales directors, pitching for work then helping distil that for the team.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Architecture
&lt;/h2&gt;

&lt;p&gt;Then one day, I was at the SDD Conference in London and I took myself off to a talk by Juval Lowy about being an Architect and that was it, a light bulb went off in my head and although I loved writing code, I loved designing systems more. That was my motivation; how could I do something better, quicker, cheaper? How could I meet that client's requirements and design something that'll exceed their expectations? I went back to my company and pretty much talked them into making me an Architect, officially taking me away from the code (although that never happened and there were always scenarios where I had to roll my sleeves up) and changing my job description to what I had been doing already.&lt;/p&gt;

&lt;p&gt;I was raw and had learnt my trade as I went, shaping myself for the companies I worked for so I did some training although to this day, I still don't have any formal qualifications and have a habit of doing things "my way" no matter where I work.&lt;/p&gt;

&lt;p&gt;Which is where the title of this article FINALLY comes back in..... does any of what I enjoyed sound like you? Are you an Engineer that derives more enjoyment from the design and the client interaction than writing the code? Can you talk to a room of people about highly technical topics in a language that anybody listening can understand?&lt;/p&gt;

&lt;p&gt;Then perhaps secretly you are an Architect and maybe there's a different and better role out there for you!&lt;/p&gt;

</description>
      <category>architecture</category>
      <category>solutiondesign</category>
      <category>web</category>
      <category>career</category>
    </item>
    <item>
      <title>CSV Schema Validation</title>
      <dc:creator>David Ayres</dc:creator>
      <pubDate>Mon, 22 Jul 2024 14:30:20 +0000</pubDate>
      <link>https://dev.to/davidayres/csv-schema-validation-1p23</link>
      <guid>https://dev.to/davidayres/csv-schema-validation-1p23</guid>
      <description>&lt;h1&gt;
  
  
  Intro
&lt;/h1&gt;

&lt;p&gt;The humble CSV file; which I'm not going to cover in detail here. If you don't know what a CSV file is and were hoping this document would help - I'm more than happy to signpost you to the &lt;a href="https://en.wikipedia.org/wiki/Comma-separated_values" rel="noopener noreferrer"&gt;Wikipedia page&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;So what's this document actually about then? Well, no matter how much we fight it, the CSV is a heavily used file format to share large amounts of data across integration platforms, which is especially useful when it comes to the always difficult task of integration across distributed third party systems.&lt;/p&gt;

&lt;p&gt;I didn't set out on my career with a strong desire to be an Integration Architect but as most other Solution Architects deal with, we have to wear many hats, so I find myself in a scenario where I'm having to deal with a lot of files, moving between a lot of systems and CSV is a format I have to handle, no matter how much I fight it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;So the good?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;CSV is a simple enough format to define; I've got X columns and a delimiter to indicate how to split those columns. Then there's the rows, the many many rows that make up the file. Quick and easy for a no code Integration platform to define. Consuming within a similar platform isn't too difficult and just becomes mostly config. As long as the columns are consistent on all rows you are good to go and can consume the file easily enough.&lt;/p&gt;

&lt;p&gt;You can also zip these files and majorly reduce the file size, which makes it much easier to send/receive. Compression is majorly efficient when it comes to a CSV.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The bad?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Trust. The provider of the file has to be very strict on how they generate their file. They always have to have all columns on each row. If they miss a column, the file can't be imported and that row will have to be rejected. Also when generating the file, that pesky delimiter has to be suitably escaped. If it isn't then it'll generate additional columns which will break the file consumer.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;For example:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ID,Title,Cost
1,Book 1,4.99
2,"Book, Book 2",5.99
3,Book, Book 3,6.99
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Row 3 will have 4 columns if you split the columns on the delimiter: ",". Row 2 should pass because the delimiter is within speech marks which should tell consumers to treat the content between them as 1 column. One hopes that's the case anyway, plus we also hope that any speech marks are suitably escaped.....&lt;/p&gt;
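
&lt;p&gt;The sketch below is an illustration only, written in Python rather than the .NET stack used later in this article. It compares a naive split on the delimiter with Python's standard &lt;code&gt;csv&lt;/code&gt; reader, which honours speech marks, using the sample file above. The variable names are assumptions made for the example.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import csv
import io

SAMPLE = '''ID,Title,Cost
1,Book 1,4.99
2,"Book, Book 2",5.99
3,Book, Book 3,6.99
'''

# Naive approach: split every line on the delimiter, ignoring speech marks.
naive_counts = [len(line.split(",")) for line in SAMPLE.splitlines()]

# Quote-aware approach: a quoted delimiter stays inside a single field.
parsed_rows = list(csv.reader(io.StringIO(SAMPLE)))
parsed_counts = [len(row) for row in parsed_rows]

print(naive_counts)   # column count per line, naive split
print(parsed_counts)  # column count per line, quote-aware reader
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The naive split counts 4 columns on both rows 2 and 3, whereas the quote-aware reader only counts 4 on row 3, the row that is genuinely broken.&lt;/p&gt;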

&lt;p&gt;So it's a pretty fragile file format and open to be broken.&lt;/p&gt;

&lt;p&gt;Also a CSV file isn't able to hold any sort of metadata. This means any CSV Interface has to have a document shared by the owner so that the data held within can be understood. It's another level of trust between the provider and consumer. More often than not fields get treated as a string by the consumer as it's less prone to breaking. That's not ideal.&lt;/p&gt;

&lt;p&gt;Then there's the complete lack of any sort of schema to validate the data against. The consumer has to understand the interface document (there's no standard for this at all) and potentially implement custom validation and logic. If you don't have a no code Integration Platform and are writing the code by hand, it can be time-consuming repeating the same effort for each CSV file you consume.&lt;/p&gt;

&lt;h1&gt;
  
  
  CSV Schemas?
&lt;/h1&gt;

&lt;p&gt;You might have looked into a CSV schema to make consuming these files easier, which is how you've probably stumbled across this article. With XML there's XSD and with JSON we have JSON Schema, so why doesn't CSV have anything!? Below is what I've stumbled across, along with my opinion of each.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CSV Schema&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;An attempt was made to try and standardise a &lt;a href="https://digital-preservation.github.io/csv-schema/#toc0" rel="noopener noreferrer"&gt;CSV Schema&lt;/a&gt; approach. An unofficial draft was made in &lt;a href="https://digital-preservation.github.io/csv-schema/csv-schema-1.1.html" rel="noopener noreferrer"&gt;2016&lt;/a&gt; but was never formally adopted.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CSV On The Web&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://csvw.org/" rel="noopener noreferrer"&gt;CSV on the Web&lt;/a&gt; is a fairly new attempt to standardise how a CSV schema should be documented. It's been recommended by the UK Government Digital Service but hasn't gotten the traction it might need. The biggest issue is that nobody has pioneered any .NET libraries to implement validation for it. I tried, but writing my own library for this became a bigger job than I was hoping for. It definitely shows promise and if it matures and there's more industry adoption, this would be a winner.&lt;/p&gt;

&lt;p&gt;Ultimately, CSV is the forgotten file format of the digital age and doesn't get as much love or attention as it deserves.&lt;/p&gt;

&lt;h1&gt;
  
  
  My Use Case
&lt;/h1&gt;

&lt;p&gt;As I've mentioned, I deal with a lot of CSV integrations. I don't have the privilege of a no code Integration Platform so I wanted to have a schema for my CSV files; to validate against when I'm generating files, or to use when consuming them. A quick schema check ahead of publishing and consuming data helps the end to end Integration process and improves data quality across the journey. There have been too many issues where bad files have been shared, which trigger support tickets and ultimately cost somebody time to look into and debug issues.&lt;/p&gt;

&lt;p&gt;I've managed to cobble together an approach for CSV validation that's proven to be fast, scalable and has managed to handle even the weirdest of file content I have to deal with on a daily basis, hopefully it's useful to somebody!&lt;/p&gt;

&lt;h1&gt;
  
  
  CSV Schema Validation Tool
&lt;/h1&gt;

&lt;p&gt;It's rare these days to have the pleasure of writing your own unique piece of software from scratch but validating a CSV file in .NET, with no packages I could just download and use, felt like I was pioneering something so I rolled up my sleeves and got stuck in!&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1; JSON Schema notation.
&lt;/h2&gt;

&lt;p&gt;If you haven't been lucky enough to work with JSON Schema, then check it out &lt;a href="https://json-schema.org/" rel="noopener noreferrer"&gt;here&lt;/a&gt;. It's mature, well supported, feature rich and a staple when defining JSON file formats and API responses. Why not make this work for a CSV?&lt;/p&gt;

&lt;p&gt;This was the easy part. A CSV header doesn't take a huge amount of work to be written as a JSON schema. With not too much work I was able to define a schema that has all the wonderfulness like variable types, lengths, I can do REGEX patterns (who doesn't love a REGEX!?), enums and even better some of the built in JSON Schema field types like email formatting. Excellent.&lt;/p&gt;

&lt;p&gt;To support a CSV file, I've introduced a couple of custom values that need to be added to the JSON Schema. All CSV files have a delimiter, so that's a mandatory field. Also (frustratingly) a CSV file doesn't have to have a header row, so that's a second mandatory field that needs to be added to the schema.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "$id": "https://example.com/person.schema.json",
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "Person",
  "type": "object",
  "csvDelimiter": ",",
  "csvHasHeader": "true",
  "properties": { }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 2; CSV to JSON.
&lt;/h2&gt;

&lt;p&gt;Now, as I've mentioned above, I work with the Microsoft stack so we can now start delving into some code.&lt;/p&gt;

&lt;p&gt;The first thing to do is read in the JSON schema file. To understand how to read the CSV file, we need that header and delimiter metadata. After some research, I settled on &lt;a href="https://jsonschema.net/" rel="noopener noreferrer"&gt;JsonSchema.net&lt;/a&gt; to read and parse my Schema files. Once the file is read, there are some validation checks to make sure the delimiter and header fields are present. If not, we reject the file as that's mandatory metadata.&lt;/p&gt;
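
&lt;p&gt;The mandatory metadata check might look something like the sketch below. It is written in Python as an illustration and is not taken from the tool itself, which uses JsonSchema.net. The function name &lt;code&gt;load_csv_schema&lt;/code&gt; is hypothetical, and comparing against the string "true" simply mirrors the example schema above.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import json

# The two custom keywords the schema must carry for a CSV file.
REQUIRED_CSV_KEYWORDS = ("csvDelimiter", "csvHasHeader")


def load_csv_schema(schema_text):
    """Parse the schema text and reject it if the CSV keywords are missing."""
    schema = json.loads(schema_text)
    missing = [key for key in REQUIRED_CSV_KEYWORDS if key not in schema]
    if missing:
        raise ValueError("schema is missing mandatory CSV metadata: " + ", ".join(missing))
    return schema


schema = load_csv_schema(
    '{"type": "object", "csvDelimiter": ",", "csvHasHeader": "true", "properties": {}}'
)
delimiter = schema["csvDelimiter"]
# Assumption: the header flag is stored as the text "true", as in the example schema.
has_header = schema["csvHasHeader"] == "true"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;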

&lt;p&gt;The second thing to do is consume the CSV into the application and to do that, for years I've advocated for the NuGet package: &lt;a href="https://joshclose.github.io/CsvHelper/" rel="noopener noreferrer"&gt;CSV Helper&lt;/a&gt;. It's been around for a very long time and for good reason. It'll read a CSV file very quickly and comes with a "dynamic" type, so it's simple enough to generically read a CSV file into a collection. During the read process, we pass in the delimiter value and whether the file has a header or not, and it does all the hard work for us!&lt;/p&gt;

&lt;p&gt;One thing I love about CSV Helper is that it handles all the special characters for you, even a carriage return within double quotes;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;firstName,lastName,age
John,Fish,5
David,"Cr
ab",22
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So the above will still produce 2 rows of data, with 3 columns but the lastName for David would be:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Cr
ab
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Which looks strange but is exactly what we are expecting.&lt;/p&gt;
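
&lt;p&gt;This behaviour isn't unique to CSV Helper. As a hedged illustration in Python (not the library used in this article), the standard &lt;code&gt;csv&lt;/code&gt; module reads the same sample in the same way, keeping the line break inside the quoted field.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import csv
import io

SAMPLE = 'firstName,lastName,age\nJohn,Fish,5\nDavid,"Cr\nab",22\n'

# newline="" hands line endings to the csv module untouched.
reader = csv.DictReader(io.StringIO(SAMPLE, newline=""))
rows = list(reader)

print(len(rows))                 # number of data rows
print(repr(rows[1]["lastName"]))  # the field with the embedded line break
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;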

&lt;p&gt;&lt;strong&gt;It's slowly coming together nicely.....&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The final step of the puzzle is to convert a CSV file into a JSON file. This is the slightly less elegant part of the solution.&lt;/p&gt;

&lt;p&gt;We loop through the CSV collection building up a dictionary of &lt;code&gt;&amp;lt;string,object&amp;gt;&lt;/code&gt;. The string part is the column name that we extract from the validated schema file. We simply take each column in the CSV row and pull out the positional field name from the schema.&lt;/p&gt;

&lt;p&gt;Of everything this feels a little "hacky" but there's no other way to associate the column name to the CSV. As column positioning is critical in a CSV, this approach simply takes advantage of that. If the schema column order doesn't match the CSV column order, everything will fall over and throw validation errors but I deemed this acceptable due to the behaviour of CSV.&lt;/p&gt;

&lt;p&gt;The second thing we do here is make sure that the data from the CSV column is parsed and stored in the dictionary as the correct type. JsonSchema.net has built in enums for Schema Value Types, so we take the field type from the Schema and parse the value from the CSV. Now we've got a nicely formatted dictionary! If we didn't properly parse the data, the schema validation would fail.&lt;/p&gt;
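
&lt;p&gt;A minimal sketch of the positional mapping and type parsing described above, in Python for illustration only. The schema's property names and types are hypothetical (the example schema earlier leaves &lt;code&gt;properties&lt;/code&gt; empty), and &lt;code&gt;row_to_dict&lt;/code&gt; and &lt;code&gt;PARSERS&lt;/code&gt; are made-up names rather than anything from the tool or from JsonSchema.net. It relies on Python keeping JSON object keys in file order.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import csv
import io
import json

# Hypothetical schema: the property names and types are illustrative only.
SCHEMA = json.loads('''{
  "csvDelimiter": ",",
  "csvHasHeader": "true",
  "properties": {
    "firstName": {"type": "string"},
    "lastName": {"type": "string"},
    "age": {"type": "integer"}
  }
}''')

# One parser per JSON Schema type; anything unlisted is kept as text.
PARSERS = {
    "integer": int,
    "number": float,
    "boolean": lambda text: text.lower() == "true",
}


def row_to_dict(row, schema):
    """Pair each value with the schema property in the same position."""
    result = {}
    for (name, rules), text in zip(schema["properties"].items(), row):
        parse = PARSERS.get(rules.get("type"), str)
        try:
            result[name] = parse(text)
        except ValueError:
            # Leave unparseable text alone so schema validation can report it.
            result[name] = text
    return result


reader = csv.reader(
    io.StringIO("firstName,lastName,age\nJohn,Fish,5\n"),
    delimiter=SCHEMA["csvDelimiter"],
)
next(reader)  # skip the header row, as csvHasHeader is "true"
record = row_to_dict(next(reader), SCHEMA)
document = json.dumps(record)  # one JSON document per CSV row
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Note that this sketch doesn't compare the number of columns in the row with the number of properties in the schema; a real implementation would want to flag that mismatch rather than silently pairing up whatever is there.&lt;/p&gt;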

&lt;p&gt;The final step in transforming the CSV to JSON is to take the Dictionary and pass it through the &lt;code&gt;System.Text.Json JsonSerializer&lt;/code&gt; so it becomes a JSON friendly string, then we parse it into a &lt;code&gt;JsonDocument&lt;/code&gt; using &lt;code&gt;JsonDocument.ParseAsync&lt;/code&gt; for the code to then treat it as a valid JSON Document. Remember, we are doing this row by row so each row is treated like its own individual JSON document. The reason for this is that it gives us a line by line schema validation result so that consumers can be directed to specific faults. It also means that the JSON Schema file is written as if it's one simple row, which makes the notation simpler and easier to understand. This could be something to revisit for a V2.&lt;/p&gt;

&lt;p&gt;Last of all, we pass that JSON document into the Schema validator library and out come some results! That's the easy part. We check through these results and add any errors to a general error object that can be utilised by a system calling this library and hey presto - we can use JSON Schema to validate a CSV file!&lt;/p&gt;
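
&lt;p&gt;To show the shape of the row by row result, here's a rough Python sketch. It is not a JSON Schema implementation: it only checks a tiny subset of keywords (&lt;code&gt;type&lt;/code&gt;, &lt;code&gt;enum&lt;/code&gt; and &lt;code&gt;pattern&lt;/code&gt;) as a stand-in for the validator library, and the property names, function names and sample records are all hypothetical. It also assumes the header sits on line 1 and that no field spans more than one line.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import re

# Hypothetical schema fragment; only "type", "enum" and "pattern" are checked here.
PROPERTIES = {
    "firstName": {"type": "string", "pattern": "^[A-Za-z]+$"},
    "age": {"type": "integer"},
    "status": {"type": "string", "enum": ["active", "inactive"]},
}

PYTHON_TYPES = {"string": str, "integer": int}


def validate_record(record, properties):
    """Return a list of human-readable problems for one converted row."""
    problems = []
    for name, rules in properties.items():
        value = record.get(name)
        expected = PYTHON_TYPES.get(rules.get("type"))
        if expected is not None and not isinstance(value, expected):
            problems.append(name + ": expected " + rules["type"])
            continue
        if "enum" in rules and value not in rules["enum"]:
            problems.append(name + ": not one of " + ", ".join(rules["enum"]))
        if "pattern" in rules and isinstance(value, str):
            if re.search(rules["pattern"], value) is None:
                problems.append(name + ": does not match " + rules["pattern"])
    return problems


def validate_rows(records, properties, first_line=2):
    """Validate row by row and key every problem by its line number."""
    errors = {}
    for line_number, record in enumerate(records, start=first_line):
        problems = validate_record(record, properties)
        if problems:
            errors[line_number] = problems
    return errors


records = [
    {"firstName": "John", "age": 5, "status": "active"},
    {"firstName": "D4vid", "age": "old", "status": "retired"},
]
errors = validate_rows(records, PROPERTIES)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Keying the errors by line number is what lets a consumer be pointed at the exact row that failed, rather than just being told the file is bad.&lt;/p&gt;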

&lt;h2&gt;
  
  
  Performance
&lt;/h2&gt;

&lt;p&gt;Without spending too much time on optimisation, it can handle a good 500,000 rows in a couple of seconds. Given these are slow moving files being validated as part of a larger integration journey, that's acceptable performance. I'm sure it can be optimised further.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Quirks
&lt;/h2&gt;

&lt;p&gt;All companies have Technical Debt and ghosts from past decisions that haunt them. Recently for me that's been multiple different CSV formats within the same file. Eurgh. However, with this tool we can handle these scenarios.&lt;/p&gt;

&lt;p&gt;There would need to be a JSON Schema file for each row format, then a way to identify what type of row is being processed. Then instead of validating a file and schema combination, we can validate passing in a row and schema one at a time. Doing it this way means we have to handle any error messaging slightly differently but with a little bit of pre-processing of the data, it becomes a trivial hurdle.&lt;/p&gt;
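
&lt;p&gt;One way that pre-processing could look is sketched below in Python, as an illustration rather than the tool's actual code. It assumes the first column identifies the row type, which may not hold for your files, and it uses plain lists of column names where the approach above would use one JSON Schema file per row format. The row types, column names and &lt;code&gt;identify_rows&lt;/code&gt; are all hypothetical.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import csv
import io

# Hypothetical layouts: the first column says which kind of row follows.
SCHEMAS_BY_ROW_TYPE = {
    "H": ["rowType", "fileDate"],
    "D": ["rowType", "orderId", "amount"],
    "T": ["rowType", "rowCount"],
}


def identify_rows(text, delimiter=","):
    """Yield (line_number, row_type, record); record is None if no layout fits."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    for row in reader:
        row_type = row[0] if row else None
        columns = SCHEMAS_BY_ROW_TYPE.get(row_type)
        if columns is None or len(columns) != len(row):
            # Unknown row type or wrong column count: report it, don't guess.
            yield reader.line_num, row_type, None
        else:
            yield reader.line_num, row_type, dict(zip(columns, row))


SAMPLE = "H,2024-07-22\nD,1001,4.99\nD,1002\nT,2\n"
results = list(identify_rows(SAMPLE))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Each record that comes back can then be validated against the schema for its own row type, one row and schema at a time, with the line number carried along for the error messaging.&lt;/p&gt;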

&lt;h1&gt;
  
  
  Conclusion
&lt;/h1&gt;

&lt;p&gt;Hopefully this has been useful and you've been able to solve a problem you've got! My approach is still in its infancy but is already being tested in a Production environment.&lt;/p&gt;

&lt;p&gt;For those of you interested in the Source Code, it's available &lt;a href="https://github.com/DavidAyresAsos/csv-schema-validation" rel="noopener noreferrer"&gt;here&lt;/a&gt;. Feel free to do a PR if you can see improvements. A reminder I'm a hobby engineer these days so there's definitely room for improvement!&lt;/p&gt;

</description>
      <category>csv</category>
      <category>dotnet</category>
      <category>jsonschema</category>
      <category>nuget</category>
    </item>
  </channel>
</rss>
