DEV Community

Craig Nicol (he/him)
Craig Nicol (he/him)

Posted on • Originally published at craignicol.wordpress.com on

1 1

CodeCraftConf 2019 : What is data anyway? (Questions)

Here are the questions I asked during my guided conversation at CodeCraftConf 2019. They are also available on GitHub if you would like to fork and modify them for your own use. Thankyou to everyone who came to the discussion, I will post a follow-up to discuss some of the interesting answers.

What is data anyway?

Navigating SQL, NoSQL, JSON and how to work with data in a post-RDMS, big-data world

Questions

Data modelling

  1. When designing a system, do you start with the data or the code?
  2. Has the rise of cloud based or non relational data stores changed how we model our data?
  3. Do you need to update your data when the models in the code change? How do you do it?
  4. Does all your data have to have the same shape?
  5. Should the data you expose to the outside world broadly match the data at rest?

Data security

  1. How do you secure your data?
  2. In light of GDPR, How do you ensure you aren’t collecting too much data?
  3. Who has access to your data?
  • Do you know if anyone unauthorised has accessed it?
  1. How do you protect yourself against bad data and trojan data?
  • Bad data = data that is fake, or is used for real world attacks
  • Trojan data = data that can compromise your or your customer’s systems

Ethical data

  1. Can your data be used to discriminate?
  • Can you prove it?
  • Is your data biased?
  • Are you recording hidden correlations? (ZIP code suggests race)
  1. Who owns your data?
  2. What questions aren’t you asking?

Unused questions

  1. What makes data big?
  2. Are you collecting the right data?
  3. Is the data you’re collecting right?
  4. Where is your data?

Technology choices

  1. Do you still have a place for traditional RDBMS?

AWS GenAI LIVE image

How is generative AI increasing efficiency?

Join AWS GenAI LIVE! to find out how gen AI is reshaping productivity, streamlining processes, and driving innovation.

Learn more

Top comments (0)

Billboard image

The Next Generation Developer Platform

Coherence is the first Platform-as-a-Service you can control. Unlike "black-box" platforms that are opinionated about the infra you can deploy, Coherence is powered by CNC, the open-source IaC framework, which offers limitless customization.

Learn more

👋 Kindness is contagious

Please leave a ❤️ or a friendly comment on this post if you found it helpful!

Okay