Our current hiring process involves manually reviewing resumes from various platforms to match candidates with job descriptions. This is inefficient given the high volume of applications.
My Proposal
I propose developing an AI workflow to automate this initial screening. The system would match candidates' experience, education, and skills against job requirements, providing hiring managers with a pre-vetted list of qualified candidates. This would streamline the hiring process significantly.
Workflow Components:
- Job Board: A full-stack application designed to serve both job seekers (users) and hiring managers.
- Content Management System: A system, such as BoltonSea, to manage and organize candidate resumes.
- AI Engine: An integration with a large language model provider (e.g., DeepSeek, OpenAI) for intelligent features such as resume parsing and candidate matching (see the configuration sketch after this list).
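To make the three components concrete, here is a minimal configuration sketch. Every URL and key below is an assumed placeholder for illustration, not a real endpoint; only the embedding model name comes from the training script further down.

```python
# Hypothetical configuration for the three workflow components.
# All URLs and keys are placeholders for illustration only.
WORKFLOW_CONFIG = {
    "job_board": {                 # Superio-based Next.js front end
        "base_url": "https://jobs.example.com",
    },
    "cms": {                       # BoltonSea instance that stores resume files
        "api_url": "https://cms.example.com/api",
        "api_key": "CMS_API_KEY",
    },
    "ai_engine": {                 # OpenAI-compatible LLM endpoint used by the training script
        "base_url": "https://OPEN_URL",
        "embedding_model": "sentence-transformers/all-MiniLM-L6-v2",
    },
}
```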
When an applicant applies for a position on our job board, they submit a resume detailing their experience, education, and skills, and a candidate record is created in our database. The original resume document, whether in MS Word or PDF format, is then securely stored in our Content Management System (CMS) for convenient access.
Furthermore, each candidate's resume link (resume URL) is forwarded to our AI engine layer to train or retrain the AI model. For efficiency, every resume is first converted into Markdown and only the essential information is kept; concretely, the training script below converts the resume with Docling and indexes it in a FAISS vector store so the model can answer questions about it.
This system is designed to benefit the hiring manager by streamlining application review and surfacing only the most qualified candidates. A sketch of the submission flow follows.
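Below is a minimal sketch of that submission flow, assuming hypothetical `create_candidate_record` and `cms_client.upload` helpers; these names stand in for the job-board database layer and the BoltonSea API, which are not shown here.

```python
import os

# Minimal sketch of the submission flow. create_candidate_record and
# cms_client.upload are hypothetical placeholders for the job-board database
# layer and the BoltonSea API; train_resume is the training script below.
def handle_resume_submission(applicant_id: str, resume_file_path: str) -> str:
    # 1. Create the candidate record in the database
    candidate_id = create_candidate_record(
        applicant_id=applicant_id,
        source_file=os.path.basename(resume_file_path),
    )

    # 2. Store the original document (MS Word or PDF) in the CMS
    resume_url = cms_client.upload(resume_file_path, folder="resumes")

    # 3. Forward the resume URL to the AI engine layer
    train_resume(resume_url)

    return candidate_id
```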
AI Training Script (Python Snippet)
Convert PDF to Markdown
```python
from docling.document_converter import DocumentConverter
from langchain_core.document_loaders import BaseLoader
from langchain_core.documents import Document as LCDocument


class DoclingResumeLoader(BaseLoader):
    def __init__(self, file_path: str):
        self.file_path = file_path
        self.converter = DocumentConverter()

    def lazy_load(self):
        # Process the resume at self.file_path with Docling
        docling_doc = self.converter.convert(self.file_path).document
        # Convert to Markdown format
        text = docling_doc.export_to_markdown()
        # Conversion complete; attach source metadata
        metadata = {
            "source": self.file_path,
            "format": "resume",
        }
        yield LCDocument(page_content=text, metadata=metadata)
```
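Because `DoclingResumeLoader` subclasses LangChain's `BaseLoader`, calling `loader.load()` in the training script below simply collects what `lazy_load()` yields: one Markdown document per resume, ready for splitting and embedding.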
Train AI Model
```python
import os
import time

from langchain.chains import ConversationalRetrievalChain
from langchain_community.vectorstores import FAISS
from langchain_core.prompts import PromptTemplate
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_openai import ChatOpenAI
from langchain_text_splitters import RecursiveCharacterTextSplitter


def train_resume(resume_url: str):
    OPENAI_API_URL = "https://OPEN_URL"  # placeholder for the OpenAI-compatible endpoint
    index_path = f"{resume_url}_faiss_index"

    # Initialize the embedding model
    embeddings = HuggingFaceEmbeddings(
        model_name="sentence-transformers/all-MiniLM-L6-v2"
    )

    if os.path.exists(index_path):
        print("Loading existing vector store...")
        vectorstore = FAISS.load_local(
            index_path, embeddings, allow_dangerous_deserialization=True
        )
        print("Vector store loaded")
    else:
        # No existing index found; create a new one from the resume
        loader = DoclingResumeLoader(resume_url)
        documents = loader.load()

        print("\nSplitting document into chunks...")
        text_splitter = RecursiveCharacterTextSplitter(
            chunk_size=1000, chunk_overlap=200, separators=["\n\n", "\n", " ", ""]
        )
        splits = text_splitter.split_documents(documents)
        print(f"Created {len(splits)} chunks")

        print("\nBuilding vector store and creating embeddings...")
        vectorstore_start = time.time()
        vectorstore = FAISS.from_documents(splits, embeddings)
        vectorstore_time = time.time() - vectorstore_start
        print(f"Vector store built in {vectorstore_time:.2f} seconds")

        print(f"Saving vector store to {index_path}")
        save_start = time.time()
        vectorstore.save_local(index_path)
        save_time = time.time() - save_start
        print(f"Vector store saved in {save_time:.2f} seconds")

    retriever = vectorstore.as_retriever(search_type="mmr", search_kwargs={"k": 5})
    print("Vector store ready")

    print("\nConnecting to local language model...")
    llm = ChatOpenAI(
        model="local-model",
        openai_api_base=OPENAI_API_URL,
        openai_api_key="not-needed",
        temperature=0,
    )

    template = """You are a helpful assistant answering questions about the resume: {book_name}.
Use the following context to answer the question: {context}
Question: {question}
Answer the question accurately and concisely based on the context provided."""
    prompt = PromptTemplate(
        input_variables=["book_name", "context", "question"], template=template
    )

    # System ready: return a retrieval chain the caller can query
    return ConversationalRetrievalChain.from_llm(
        llm=llm,
        retriever=retriever,
        return_source_documents=True,
        combine_docs_chain_kwargs={
            "prompt": prompt,
            "document_variable_name": "context",
        },
    )
```
Ask A.I. "The Question"
Question: Who qualifies for this job with {{description}}? (Here {{description}} stands for the job description supplied at query time; see the example after the interaction loop below.)
```python
# `args` comes from the script's command-line parser and `print_result` is a
# small display helper; both are defined elsewhere in the script (not shown).
trained_model = train_resume(args.url)
chat_history = []

print("\nReady to answer questions\n")
print("Type 'quit!' to exit")

while True:
    question = input("\nAsk a question: ")
    if question.lower() == "quit!":
        break

    print("\n... Processing ...\n")
    result = trained_model.invoke(
        {
            "question": question,
            "chat_history": chat_history,
            "book_name": os.path.basename(args.resume_path),
        }
    )
    print_result(result)
    chat_history.append((question, result["answer"]))
```
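In practice, the {{description}} placeholder from the example question is replaced with the actual job description before the question reaches the chain. A minimal sketch continuing the script above, with a made-up job description string:

```python
# Hypothetical example: substitute a real job description into the question
# template before invoking the trained chain. The description text is made up.
job_description = "Senior Python developer, 5+ years, NLP and vector search experience"
question = f"Who qualifies for this job with {job_description}?"

result = trained_model.invoke(
    {
        "question": question,
        "chat_history": [],
        "book_name": os.path.basename(args.resume_path),
    }
)
print(result["answer"])
```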
Tools & Dependencies

requirements.txt:

```
langchain-core==0.3.28
langchain-text-splitters==0.3.4
langchain_huggingface==0.1.2
docling==2.14.0
langchain_community==0.3.13
langchain_openai==0.2.14
langchain==0.3.13
faiss-cpu==1.9.0
```

Installation: `pip install -r requirements.txt`
- Job Portal (Next.js): Superio is a professional job board template that provides the advanced features and services a job board needs for both job seekers and hiring managers.
- BoltonSea (CMS): a content management system offering file, media management, and streaming services through an API and a dashboard UI.
- Docling (open-source document processing for AI): simplifies document processing with advanced PDF understanding, OCR support, and seamless AI integrations; it parses PDFs, DOCX, PPTX, images, and more.
