loading...

Best way to store array type data inside database?

mittalyashu profile image Yashu Mittal ・1 min read

What is the best way to store string of array inside a database column?

I have a tags columns, where I want to store multiple tags.

Now I can think of two possible way to this but not sure which is really effective for large-size data.

1st

Directly store the data in an array of a row.

[ { id: 1, name: 'action' }, { id: 2, name: 'classic' }, { id: 3, 'science fiction' } ]

If I want to add the data I push at the end of the array and for remove I can search it by ID.

2st

Create a separate tags table.

| id | name | post_id | tag_id |
| --- | --- | --- | 2 |
| 1 | 'action' | 342342 | 34 |
| 2 | 'classic' | 32424453 | 432 |
| 3 | 'science fiction' | 32424453 | 34 |

Based on the post_id in the tags table I can get all the tags for that specific post.

NOTE: These are not the actual tags but the reference where; which post is having which tag.


I would like to hear your feedback on these two ways, if you have another way to deal with array data type, please let me know in the comment section below.

Posted on Feb 10 by:

mittalyashu profile

Yashu Mittal

@mittalyashu

I <3 to be a developer and designer.

Discussion

markdown guide
 
CREATE TABLE post (
`post_id` INT NOT NULL PRIMARY KEY,
`title` VARCHAR(50)
);

CREATE TABLE tags (
`tag_id` INT NOT NULL PRIMARY KEY
`tag_name` VARCHAR(20),
);

CREATE TABLE post_tags (
`post_id` INT NOT NULL,
`tag_id` INT NOT NULL,
PRIMARY KEY(`person_id`, `tag_id`)
);

The post_tags table contains the foreign keys of tags and posts.
Something like this (:

post_id tag_id
1 1
1 5
1 3
2 2
2 1
2 3

To retrieve a post and all its tags, you need to JOIN the tables in your queries:

SELECT p.*, t.*
FROM post p
INNER JOIN post_tags pt
ON pt.post_id = p.post_id
INNER JOIN tags t
ON t.tag_id = pt.tag_id

This answer is adapted from How to store arrays in MySQL from StackOverflow.

 

Thank you for sharing the code sample.

But, I want to share multiple tags for single post.

 

To retrieve a post and all its tags, you need to JOIN the tables in your queries

On using the INNER JOIN clause it doesn't get the data if specific post have multiple tag_id

image

SELECT
    *
FROM
    post
INNER JOIN tag
ON post.post_id = tag.post_id;

it only return the data for 3rd row according to the above image.

INNER JOIN tag

Can you see the mistake here?

You need to do the JOIN with the post_tags table (with both references, posts and tags) to work.

Yeah I know that,

I tried with the both with and without post_tag table, it was return 0 rows.

 

This is the best solution on the topic. +1

 

Depends on what tags mean in your app, as in both examples have ids I guess they are part of your domain model. You can have a table which lists all available tags and another one with foreign keys to both posts and tags to list which posts have which tags.

 

As I have mentioned in the article. A post can have multiple tags.

 

So yeah, having a many to many table is one solution. But you can also store them as an array or as JSONB in PostgreSQL. It depends if tags themselves are an entity in your app you want to attach metadata to or are just, well, an array of names you might want to query on (for example to give you the list of posts that have a specific tag)

Yes, tags itself have a separate entity in the app, to explain it in simpler words, this is how the data output I am expecting.

{
  postId: 3453,
  title: "sdf",
  slug: "sfs",
  tags: [
    {
      name: "sdfs",
      id: "34534"
    },
    {
      name: 'sdfs',
      id: "35344"
    }
  ]
}
 

I have seen it stored like this (which is actually Anki)

'\x1f' + [ ...(string[]) ].join('\x1f') + '\x1f'

But if I had to do it more complex, like more fields, and rename-able, I would definitely do it the 2st way.

I wouldn't rely on JSON Object, if I can avoid.

 

Hmm, did you discard creating another table? What was the reason behind?

 

What do you mean by creating another table?

In the second option I am creating another table name tags to store the connection between the posts and tags.

 

I'm sorry, I totally missed. I just though that you would storage the array in a varchar (or any lob type) using JSON o separating it with |.

I would go for this three tables: posts, tags and post_tags (Or any naming you like).

posts --> All the posts
tags --> A master table with all the tags
post_tags --> A table with a post_id and tag_id to do the many to many relationship.

A table with a post_id and tag_id to do the many to many relationship.

You mean a single row of post_tags can contain multiple tags_id column.

I mean something like:

posts

post_id other_value
1 something
2 something
3 something

tags

tag_id tag_text
1 tag1
2 tag2
3 tag3

post_tags

post_id tag_id
1 1
1 2
2 3
3 1
3 2
3 3
 

Personally I would use the second approach but make a link table which can group all the tags together via a foreign key.

Cheers, Scott

 

I didn't understood the part about making a link table?

Could you please elaborate that?

 

So I would have two tables. The first one would contain the a primary key of post_id and the name or that post.

The second table would contain all the individual tags linked to that one post. You could then make a foreign key link to the first table.

Hope that makes sense. But think you’ve done this already in your second approach.

Good job

Right, but using the concept of foreign key we can only link single tab to the post table, not an array of tags. Right?

I might be wrong too, IDK