DEV Community

loading...

Best way to store array type data inside database?

mittalyashu profile image Yashu Mittal ・1 min read

What is the best way to store string of array inside a database column?

I have a tags columns, where I want to store multiple tags.

Now I can think of two possible way to this but not sure which is really effective for large-size data.

1st

Directly store the data in an array of a row.

[ { id: 1, name: 'action' }, { id: 2, name: 'classic' }, { id: 3, 'science fiction' } ]

If I want to add the data I push at the end of the array and for remove I can search it by ID.

2st

Create a separate tags table.

| id | name | post_id | tag_id |
| --- | --- | --- | 2 |
| 1 | 'action' | 342342 | 34 |
| 2 | 'classic' | 32424453 | 432 |
| 3 | 'science fiction' | 32424453 | 34 |

Based on the post_id in the tags table I can get all the tags for that specific post.

NOTE: These are not the actual tags but the reference where; which post is having which tag.


I would like to hear your feedback on these two ways, if you have another way to deal with array data type, please let me know in the comment section below.

Discussion (21)

pic
Editor guide
Collapse
flrnd profile image
Florian Rand • Edited
CREATE TABLE post (
`post_id` INT NOT NULL PRIMARY KEY,
`title` VARCHAR(50)
);

CREATE TABLE tags (
`tag_id` INT NOT NULL PRIMARY KEY
`tag_name` VARCHAR(20),
);

CREATE TABLE post_tags (
`post_id` INT NOT NULL,
`tag_id` INT NOT NULL,
PRIMARY KEY(`person_id`, `tag_id`)
);
Enter fullscreen mode Exit fullscreen mode

The post_tags table contains the foreign keys of tags and posts.
Something like this (:

post_id tag_id
1 1
1 5
1 3
2 2
2 1
2 3

To retrieve a post and all its tags, you need to JOIN the tables in your queries:

SELECT p.*, t.*
FROM post p
INNER JOIN post_tags pt
ON pt.post_id = p.post_id
INNER JOIN tags t
ON t.tag_id = pt.tag_id
Enter fullscreen mode Exit fullscreen mode

This answer is adapted from How to store arrays in MySQL from StackOverflow.

Collapse
mittalyashu profile image
Yashu Mittal Author

Thank you for sharing the code sample.

But, I want to share multiple tags for single post.

Collapse
flrnd profile image
Florian Rand

To retrieve a post and all its tags, you need to JOIN the tables in your queries

Thread Thread
mittalyashu profile image
Yashu Mittal Author • Edited

On using the INNER JOIN clause it doesn't get the data if specific post have multiple tag_id

image

SELECT
    *
FROM
    post
INNER JOIN tag
ON post.post_id = tag.post_id;

it only return the data for 3rd row according to the above image.

Thread Thread
flrnd profile image
Florian Rand
INNER JOIN tag

Can you see the mistake here?

You need to do the JOIN with the post_tags table (with both references, posts and tags) to work.

Thread Thread
mittalyashu profile image
Yashu Mittal Author

Yeah I know that,

I tried with the both with and without post_tag table, it was return 0 rows.

Collapse
hassanfarid profile image
Hassan Farid

This is the best solution on the topic. +1

Collapse
rhymes profile image
rhymes

Depends on what tags mean in your app, as in both examples have ids I guess they are part of your domain model. You can have a table which lists all available tags and another one with foreign keys to both posts and tags to list which posts have which tags.

Collapse
mittalyashu profile image
Yashu Mittal Author • Edited

As I have mentioned in the article. A post can have multiple tags.

Collapse
rhymes profile image
rhymes

So yeah, having a many to many table is one solution. But you can also store them as an array or as JSONB in PostgreSQL. It depends if tags themselves are an entity in your app you want to attach metadata to or are just, well, an array of names you might want to query on (for example to give you the list of posts that have a specific tag)

Thread Thread
mittalyashu profile image
Yashu Mittal Author

Yes, tags itself have a separate entity in the app, to explain it in simpler words, this is how the data output I am expecting.

{
  postId: 3453,
  title: "sdf",
  slug: "sfs",
  tags: [
    {
      name: "sdfs",
      id: "34534"
    },
    {
      name: 'sdfs',
      id: "35344"
    }
  ]
}
Collapse
patarapolw profile image
Pacharapol Withayasakpunt • Edited

I have seen it stored like this (which is actually Anki)

'\x1f' + [ ...(string[]) ].join('\x1f') + '\x1f'

But if I had to do it more complex, like more fields, and rename-able, I would definitely do it the 2st way.

I wouldn't rely on JSON Object, if I can avoid.

Collapse
smartcodinghub profile image
Oscar

Hmm, did you discard creating another table? What was the reason behind?

Collapse
mittalyashu profile image
Yashu Mittal Author

What do you mean by creating another table?

In the second option I am creating another table name tags to store the connection between the posts and tags.

Collapse
smartcodinghub profile image
Oscar

I'm sorry, I totally missed. I just though that you would storage the array in a varchar (or any lob type) using JSON o separating it with |.

I would go for this three tables: posts, tags and post_tags (Or any naming you like).

posts --> All the posts
tags --> A master table with all the tags
post_tags --> A table with a post_id and tag_id to do the many to many relationship.

Thread Thread
mittalyashu profile image
Yashu Mittal Author

A table with a post_id and tag_id to do the many to many relationship.

You mean a single row of post_tags can contain multiple tags_id column.

Thread Thread
smartcodinghub profile image
Oscar

I mean something like:

posts

post_id other_value
1 something
2 something
3 something

tags

tag_id tag_text
1 tag1
2 tag2
3 tag3

post_tags

post_id tag_id
1 1
1 2
2 3
3 1
3 2
3 3
Collapse
scottandrews98 profile image
Scott Andrews

Personally I would use the second approach but make a link table which can group all the tags together via a foreign key.

Cheers, Scott

Collapse
mittalyashu profile image
Yashu Mittal Author

I didn't understood the part about making a link table?

Could you please elaborate that?

Collapse
scottandrews98 profile image
Scott Andrews

So I would have two tables. The first one would contain the a primary key of post_id and the name or that post.

The second table would contain all the individual tags linked to that one post. You could then make a foreign key link to the first table.

Hope that makes sense. But think you’ve done this already in your second approach.

Good job

Thread Thread
mittalyashu profile image
Yashu Mittal Author

Right, but using the concept of foreign key we can only link single tab to the post table, not an array of tags. Right?

I might be wrong too, IDK