Background
Once upon a time, I needed to consume a list of users from an API that returns JSON. As expected, the provider paginates the list endpoint. Unfortunately, I needed all of the users, which means I had to hit every possible page of the pagination to fetch them.
Here's the API endpoint:
.../api/v2/users?page=1&per_page=4
and here's the API response:
{
  "meta": {
    "page": 1,
    "per_page": 4,
    "total_pages": 100
  },
  "data": [
    {
      "id": "ef5b4299-b8d0-40c2-8442-2fdc90e342ae",
      "firstname": "Cale",
      "lastname": "Schmidt",
      "email": "Liana8@gmail.com"
    },
    {
      "id": "c0328632-1f18-4cf6-ad53-17278ffcc7bb",
      "firstname": "Norval",
      "lastname": "Feest",
      "email": "Eleanora.Gleason55@gmail.com"
    },
    {
      "id": "0c467516-444f-4d01-b40d-82f7a646820a",
      "firstname": "Muriel",
      "lastname": "Berge",
      "email": "Clarissa_Simonis44@hotmail.com"
    },
    {
      "id": "2942a4b0-eac4-463c-87e3-6c7d0406cc81",
      "firstname": "Cletus",
      "lastname": "Considine",
      "email": "Casper.Kris77@gmail.com"
    }
  ]
}
The first thing that came to mind when I faced this problem: hit the first page to learn the last page number, then iterate over every page and merge the users from each response. To improve performance, I fetch the pages asynchronously. As a result, the merged user data is not in sequential order, and I'm OK with that.
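To make that plan concrete before reaching for any library, here is a minimal sketch of it in plain Go. fetchPage and fetchAll are hypothetical helpers I'm writing purely for illustration, and note that this naive version fires every request at once, which is exactly the problem the batching described later addresses.

// page mirrors just the parts of the response the sketch needs.
type page struct {
	Meta struct {
		TotalPages int `json:"total_pages"`
	} `json:"meta"`
	Data []json.RawMessage `json:"data"`
}

// fetchPage GETs a single page and decodes it.
func fetchPage(client *http.Client, urlFmt string, n int) (*page, error) {
	resp, err := client.Get(fmt.Sprintf(urlFmt, n))
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	var p page
	if err := json.NewDecoder(resp.Body).Decode(&p); err != nil {
		return nil, err
	}
	return &p, nil
}

// fetchAll reads page 1 to learn the boundary, then fetches the
// remaining pages concurrently and merges the results.
func fetchAll(client *http.Client, urlFmt string) ([]json.RawMessage, error) {
	first, err := fetchPage(client, urlFmt, 1)
	if err != nil {
		return nil, err
	}
	users := first.Data
	var mu sync.Mutex
	var wg sync.WaitGroup
	for n := 2; n <= first.Meta.TotalPages; n++ {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			p, err := fetchPage(client, urlFmt, n)
			if err != nil {
				return // a real implementation would collect errors
			}
			mu.Lock()
			users = append(users, p.Data...)
			mu.Unlock()
		}(n)
	}
	wg.Wait()
	return users, nil // merged out of order, which is fine here
}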
Code
I developed a module that provides the functionality mentioned above. You can get it in this repository.
Usage
Create structs representing the JSON response, so the API response is bound to a static data type.
type Meta struct {
	Page       int `json:"page"`
	PerPage    int `json:"per_page"`
	TotalPages int `json:"total_pages"`
}

type User struct {
	Uuid      string `json:"id" gorm:"column:id;primaryKey"`
	FirstName string `json:"firstname" gorm:"column:firstname"`
	LastName  string `json:"lastname" gorm:"column:lastname"`
	Email     string `json:"email" gorm:"column:email"`
}

type Response struct {
	Meta Meta   `json:"meta"`
	User []User `json:"data"`
}

// GetBoundary tells the aggregator how many pages there are to fetch.
func (resp *Response) GetBoundary() int {
	return resp.Meta.TotalPages
}
In case you're wondering about the gorm tags: yes, I'm going to use GORM later to store these users.
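If you want to sanity-check the binding before wiring anything up, a quick throwaway main like this (my own sketch, not part of the library) does the trick:

func main() {
	// the sample payload from the top of the post, trimmed to one user
	raw := `{"meta":{"page":1,"per_page":4,"total_pages":100},"data":[{"id":"ef5b4299-b8d0-40c2-8442-2fdc90e342ae","firstname":"Cale","lastname":"Schmidt","email":"Liana8@gmail.com"}]}`
	var resp Response
	if err := json.Unmarshal([]byte(raw), &resp); err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(resp.GetBoundary())  // 100
	fmt.Println(resp.User[0].Email) // Liana8@gmail.com
}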
Then I create a module instance with some configuration, like this:
// Meta, User, Response, and GetBoundary are the same as above.
func main() {
	var jsonPages Response
	pag, err := paginationaggregator.NewPaginationAggregator(
		&paginationaggregator.PaginationAggregatorConfig{
			URL:        "http://127.0.0.1:3000/api/v2/users?per_page=5&page=%d",
			Client:     &http.Client{},
			JsonPage:   &jsonPages,
			Concurrent: 5,
		},
	)
	if err != nil {
		fmt.Println(err)
		return
	}
	_ = pag // pag.Get() is shown in the full example below
}
As you can see, the library provides a Concurrent configuration. With Concurrent set to 5, fetching 100 pages produces 20 batches, and each batch makes 5 asynchronous requests until every page is fetched. This should keep the API server from treating the burst as a DDoS attack and keep the requests from hitting rate limiters.
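In other words, the batch count is just the total page count divided by Concurrent, rounded up. A trivial sketch of that arithmetic (my own helper, not part of the library):

// batches is how many rounds of requests are needed when fetching
// totalPages pages with `concurrent` requests per batch.
func batches(totalPages, concurrent int) int {
	return (totalPages + concurrent - 1) / concurrent // ceiling division
}

// batches(100, 5) == 20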
Insert users into the database
After successfully retrieving the users, I need to insert them into the database, but only if the user's email does not already exist. Rather than waiting for every user to be retrieved, the library provides a ConcurrentBatch callback so we can process each batch as soon as it completes.
// Meta, User, Response, and GetBoundary are the same as above.
var Mysql *gorm.DB

func main() {
	var jsonPages Response
	Mysql = connectMysql()
	defer func() {
		sql, _ := Mysql.DB()
		sql.Close()
	}()
	pag, err := paginationaggregator.NewPaginationAggregator(
		&paginationaggregator.PaginationAggregatorConfig{
			URL:             "http://127.0.0.1:3000/api/v2/users?per_page=5&page=%d",
			Client:          &http.Client{},
			JsonPage:        &jsonPages,
			Concurrent:      5,
			ConcurrentBatch: processingBatch,
		},
	)
	if err != nil {
		fmt.Println(err)
		return
	}
	res, err := pag.Get()
	if err != nil {
		fmt.Println(err)
		return
	}
	for _, val := range res {
		fmt.Println(val.Response.Data)
	}
}
// processingBatch selects and inserts users for every batch.
func processingBatch(batchResult []paginationaggregator.HttpInteraction) error {
	var newUsers []User
	for _, val := range batchResult {
		resp := Response{}
		if err := json.Unmarshal([]byte(val.Response.Data), &resp); err != nil {
			continue
		}
		for _, user := range resp.User {
			userTmp := User{}
			// check for an existing user by email
			if err := Mysql.Where("email = ?", user.Email).First(&userTmp).Error; err != nil {
				if errors.Is(err, gorm.ErrRecordNotFound) {
					// user does not exist yet, so queue it for insertion
					newUsers = append(newUsers, user)
				}
				continue
			}
		}
	}
	// GORM returns an error when Create is called with an empty slice
	if len(newUsers) == 0 {
		return nil
	}
	if err := Mysql.Create(&newUsers).Error; err != nil {
		return err
	}
	return nil
}
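One note on this design: the existence check costs one SELECT per user. If the email column carries a unique index (a schema assumption on my part, not something the code above requires), an alternative is to let MySQL skip duplicates in a single statement using GORM's OnConflict clause from gorm.io/gorm/clause. A sketch of that alternative:

// insertNewUsers inserts the batch and silently skips rows that
// conflict with an existing unique key (here: a unique email index).
func insertNewUsers(db *gorm.DB, users []User) error {
	if len(users) == 0 {
		return nil
	}
	return db.Clauses(clause.OnConflict{DoNothing: true}).Create(&users).Error
}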
func connectMysql() *gorm.DB {
	dsn := "root:yourpassword@tcp(127.0.0.1:3306)/peoples"
	DB, err := gorm.Open(mysql.Open(dsn), &gorm.Config{
		Logger: logger.Default.LogMode(logger.Silent),
	})
	if err != nil {
		panic(err)
	}
	return DB
}