DEV Community

masoomjethwa

Exoplanet Data

The NASA Exoplanet Archive is a database of exoplanets and their host stars. It contains information on over 5,000 confirmed exoplanets and thousands of candidates. The data is collected from a variety of sources and is updated regularly. The archive is a valuable resource for scientists and anyone interested in learning more about exoplanets.
NASA Exoplanet Data

Exoplanet data

  • Name: Name of the exoplanet.
  • Mass (MJ): Mass of the exoplanet in terms of Jupiter masses (MJ).
  • Radius (RJ): Radius of the exoplanet in terms of Jupiter radii (RJ).
  • Period (days): Orbital period of the exoplanet in days.
  • Semi-major axis (AU): Semi-major axis of the exoplanet's orbit in astronomical units (AU).
  • Temp: Temperature of the exoplanet in kelvins (K).
  • Discovery method: Method used to discover the exoplanet.
  • Disc. Year: Year of discovery.
  • Distance (ly): Distance from Earth in light years (ly).
  • Host star mass (M☉): Mass of the host star in terms of solar masses (M☉).
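
Before running the analysis, it can help to confirm that the file actually has these columns. The following is a minimal sketch, assuming the CSV header uses exactly the labels listed above; `EXPECTED_COLUMNS`, `missing_columns`, and the two-column sample are hypothetical and only for illustration.

```python
import io

import pandas as pd

# Column labels copied from the list above. That the CSV header uses
# exactly these strings is an assumption.
EXPECTED_COLUMNS = [
    'Name', 'Mass (MJ)', 'Radius (RJ)', 'Period (days)',
    'Semi-major axis (AU)', 'Temp', 'Discovery method', 'Disc. Year',
    'Distance (ly)', 'Host star mass (M☉)',
]


def missing_columns(df, expected=EXPECTED_COLUMNS):
    """Return the expected column labels that are absent from df."""
    return [col for col in expected if col not in df.columns]


# Made-up sample standing in for exoplanets.csv (not real archive data).
sample = pd.read_csv(io.StringIO('Name,Mass (MJ)\nPlanet A,1.5\n'))
print(missing_columns(sample))
```

An empty list means every expected column is present; anything else names the columns to rename or add before plotting.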

Analysis

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt


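def clean_value(value):
    """Convert strings such as '1,234±56' to a float (NaN if unparseable)."""
    if not isinstance(value, str):
        return value
    # drop thousands separators, then keep only the value before any '±'
    value = value.replace(',', '').split('±')[0].strip()
    try:
        return float(value)
    except ValueError:
        return float('nan')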


df = pd.read_csv('exoplanets.csv')

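# when were they discovered?
year = 'Disc. Year'
max_year = int(df[year].max())
min_year = int(df[year].min())
# one bin per year
df[[year]].plot.hist(bins=1 + max_year - min_year, legend=True)
plt.title('Discoveries per year')
plt.show()

# how were they discovered?
method = 'Discovery method'
df[method].value_counts().plot(kind='pie')
plt.title('Discovery methods')
# show each chart separately so the pie is not drawn over the histogram
plt.show()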

# do we expect to see a correlation between mass and distance?
distance = 'Distance (ly)'
mass = 'Mass (MJ)'
df[distance] = df[distance].apply(clean_value)
df[mass] = df[mass].apply(clean_value)

sns.scatterplot(
    data=df, x=distance, y=mass, hue=method, palette='tab10', legend=False
)
plt.title('Mass vs. Distance')
plt.show()

# how about mass vs radius?
radius = 'Radius (RJ)'
df[radius] = df[radius].apply(clean_value)

sns.scatterplot(
    data=df, y=mass, x=radius, hue=method, palette='tab10', legend=False
)
plt.title('Mass vs. Radius')
plt.show()

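# maybe a pairplot will help
grid = sns.pairplot(
    data=df[[mass, radius, distance, method]],
    hue=method, diag_kind='kde', palette='tab10'
)
# pairplot builds its own figure, so the title goes on the figure itself
grid.figure.suptitle('Pairplot of exoplanet properties', y=1.02)
plt.show()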

# let's look directly at the correlations
# (numeric columns only: the discovery method is categorical)
correlations = df[[mass, radius, distance]].corr()
sns.heatmap(correlations, cmap='coolwarm', annot=True)
plt.title('Correlations between exoplanet properties')
plt.show()

print(correlations)
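
To try the cleaning and correlation steps without the CSV, here is a small self-contained sketch. The rows are made up and are not archive data, and `to_number` is a hypothetical stand-in for the cleaning helper; it assumes values look like `'1,000'` or `'1.0±0.1'`.

```python
import pandas as pd


def to_number(value):
    """Hypothetical helper: parse '1,234.5±0.2'-style strings into floats."""
    if not isinstance(value, str):
        return value
    value = value.replace(',', '').split('±')[0].strip()
    try:
        return float(value)
    except ValueError:
        return float('nan')


# Made-up rows for illustration only.
toy = pd.DataFrame({
    'Mass (MJ)': ['1.0±0.1', '2.0', '3.0±0.5', 'unknown'],
    'Radius (RJ)': ['1.0', '1.1±0.05', '1.2', '1.3'],
    'Distance (ly)': ['1,000', '250±10', '500', '750'],
})

# Clean every column, then correlate the resulting numeric values.
cleaned = toy.apply(lambda col: col.map(to_number))
toy_correlations = cleaned.corr()

print(cleaned.dtypes)
print(toy_correlations)
```

Unparseable entries such as `'unknown'` become NaN, and `corr()` skips them pairwise, so one bad cell does not discard the whole column.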

