So, a few days ago the Numerical Analysis teacher at my university gave us a project: coding a mathematical method for solving systems of equations.

At first I thought of doing just a single solution for the problem, but whats the fun I that? So after thinking of fun ways to do my homework (I'm kind of a nerd), I ended up writing the solution in both C++ and CUDA.

Note that I don't really know how to code in CUDA, but hey, it's not real work, so it's OK to learn from my mistakes.

This is my first time posting my code, so it would be really nice to read feedback from you guys.

So, this is my approach to the problem in C++:

```
void jordan(double** matrix, double** answers, int matrixSize) { // Gauss-Jordan elimination
    //clock_t begin = clock();
    // Note: there is no pivoting, so a zero on the diagonal causes a division by zero.
    for (int i = 0; i < matrixSize; i++) { // Walks down the diagonal of the matrix
        double temp = matrix[i][i];
        for (int x = 0; x < matrixSize; x++) { // Divides row i by the pivot so that [i][i] becomes 1
            matrix[i][x] = matrix[i][x] / temp;
        }
        answers[i][0] = answers[i][0] / temp;
        for (int y = 0; y < matrixSize; y++) {
            if (y != i) {
                double multiplyFactor = matrix[y][i];
                for (int x = i; x < matrixSize; x++) { // Subtracts a multiple of row i so that column i becomes 0 in row y
                    matrix[y][x] = matrix[y][x] - (matrix[i][x] * multiplyFactor);
                }
                answers[y][0] = answers[y][0] - (answers[i][0] * multiplyFactor);
            }
        }
    }
}
```
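One thing the routine above does not handle is a zero (or very small) value on the diagonal, since it divides by `matrix[i][i]` directly. Here is a minimal sketch of the same elimination with partial pivoting added. The pivoting step, the `std::vector` storage, the `1e-12` tolerance, and the name `solveGaussJordan` are my own assumptions for illustration, not part of the code above.

```
#include <cassert>
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical helper: solves A x = b by Gauss-Jordan elimination with
// partial pivoting. A and b are taken by value, so the caller's data is
// left untouched.
std::vector<double> solveGaussJordan(std::vector<std::vector<double>> a,
                                     std::vector<double> b) {
    const std::size_t n = a.size();
    assert(b.size() == n);
    for (std::size_t i = 0; i < n; ++i) {
        // Partial pivoting: find the row at or below i with the largest
        // absolute value in column i and swap it into position i.
        std::size_t pivotRow = i;
        for (std::size_t r = i + 1; r < n; ++r) {
            if (std::fabs(a[r][i]) > std::fabs(a[pivotRow][i])) pivotRow = r;
        }
        if (std::fabs(a[pivotRow][i]) < 1e-12) {  // assumed tolerance
            throw std::runtime_error("matrix is singular or nearly singular");
        }
        std::swap(a[i], a[pivotRow]);
        std::swap(b[i], b[pivotRow]);

        // Scale row i so the pivot becomes 1 (columns before i are already 0).
        const double pivot = a[i][i];
        for (std::size_t x = i; x < n; ++x) a[i][x] /= pivot;
        b[i] /= pivot;

        // Eliminate column i from every other row.
        for (std::size_t y = 0; y < n; ++y) {
            if (y == i) continue;
            const double factor = a[y][i];
            for (std::size_t x = i; x < n; ++x) a[y][x] -= a[i][x] * factor;
            b[y] -= b[i] * factor;
        }
    }
    return b;  // A has been reduced to the identity, so b holds the solution.
}
```

For example, the system `2x + y = 5`, `x + 3y = 10` should come back as `x = 1`, `y = 3`, and a system whose first diagonal entry is 0 no longer blows up as long as some row below has a usable pivot.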

And this is the same method in CUDA, with the kernel first and the host-side loop after it:

```
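// The matrix is one contiguous row-major array: element [r][c] is matrix[r * n + c].
// A double** copied with cudaMemcpy would only carry host pointers, which the GPU cannot dereference.
// double is used throughout because long double is not supported in device code.
__global__ void jordanGPU(double* matrix, double* answer, int n, int i)
{
    int idx = threadIdx.x; // One thread per row
    if (idx < n && idx != i) {
        double multiplyFactor = matrix[idx * n + i];
        for (int x = i; x < n; x++) { // Subtracts a multiple of row i so that column i becomes 0 in this row
            matrix[idx * n + x] = matrix[idx * n + x] - (matrix[i * n + x] * multiplyFactor);
        }
        answer[idx] = answer[idx] - (answer[i] * multiplyFactor);
    }
}
```

```
// Assumes matrix (matrixSize * matrixSize doubles, row-major) and answers (matrixSize doubles) are host arrays.
clock_t begin = clock();
size_t matrixBytes = sizeof(double) * matrixSize * matrixSize;
size_t answerBytes = sizeof(double) * matrixSize;
double *matrixGPU = nullptr, *answersGPU = nullptr;
cudaMalloc(&matrixGPU, matrixBytes);
cudaMalloc(&answersGPU, answerBytes);
for (int i = 0; i < matrixSize; i++) { // Walks down the diagonal of the matrix
    double temp = matrix[i * matrixSize + i];
    for (int x = 0; x < matrixSize; x++) { // Divides row i by the pivot so that [i][i] becomes 1
        matrix[i * matrixSize + x] = matrix[i * matrixSize + x] / temp;
    }
    answers[i] = answers[i] / temp;
    cudaMemcpy(matrixGPU, matrix, matrixBytes, cudaMemcpyHostToDevice);
    cudaMemcpy(answersGPU, answers, answerBytes, cudaMemcpyHostToDevice);
    // A single block is launched, so matrixSize must not exceed the per-block thread limit (1024 on current GPUs).
    jordanGPU<<<1, matrixSize>>>(matrixGPU, answersGPU, matrixSize, i);
    // These blocking copies also wait for the kernel to finish.
    cudaMemcpy(matrix, matrixGPU, matrixBytes, cudaMemcpyDeviceToHost);
    cudaMemcpy(answers, answersGPU, answerBytes, cudaMemcpyDeviceToHost);
}
// Free the device buffers once, after the loop, not on every iteration.
cudaFree(matrixGPU);
cudaFree(answersGPU);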
```
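With two implementations of the same method, it helps to have a quick way to check that both produce a valid solution. Here is a minimal sketch that computes the largest residual `|A x - b|` over all rows. It assumes the matrix is stored as one contiguous row-major array (the kind of buffer a single `cudaMemcpy` can transfer), and the name `maxResidual` is hypothetical. Since the elimination overwrites its inputs, the check has to be run against copies of the original `A` and `b` taken before solving.

```
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the largest |sum_j A[i][j] * x[j] - b[i]| over
// all rows i. A is row-major, so A[i][j] lives at a[i * n + j].
double maxResidual(const std::vector<double>& a,
                   const std::vector<double>& x,
                   const std::vector<double>& b) {
    const std::size_t n = b.size();
    assert(a.size() == n * n && x.size() == n);
    double worst = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        double sum = 0.0;
        for (std::size_t j = 0; j < n; ++j) {
            sum += a[i * n + j] * x[j];
        }
        worst = std::max(worst, std::fabs(sum - b[i]));
    }
    return worst;
}
```

A residual close to zero (relative to the size of the numbers in `b`) from both the CPU and the GPU run is a reasonable sign that the two versions agree.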

Hope to read recommendations and suggestions about my code from you guys!

Thanks for reading!~

## Discussion

Hi Carlos, first of all, very interesting approach using CUDA. Just wanted to let you know you have a typo where you say "but whats the fun I that?" It should be "in that" instead of "I" :)

Omg, that's right! That was totally a typo, thanks for reading man! :)

Keep writing! Very interesting stuff you are doing