Mike

AI to Match Qualified Candidates with Job Postings

Our current hiring process involves manually reviewing resumes from various platforms to match candidates with job descriptions. This is inefficient given the high volume of applications.

My Proposal

I propose developing an AI workflow to automate this initial screening. The system would match candidates' experience, education, and skills against job requirements, providing hiring managers with a pre-vetted list of qualified candidates. This would streamline the hiring process significantly.

Workflow Components:

  • Job Board: A full-stack application designed to serve both job seekers (users) and hiring managers.
  • Content Management System: A system, such as BoltonSea, to manage and organize candidate resumes.
  • AI Engine: An integration with a large language model provider (e.g., DeepSeek, OpenAI) for intelligent features.

AI-Powered Job Board Workflow

This section outlines what happens when an applicant applies for a position on our job board. When the applicant submits a resume detailing their experience, education, and skills, a candidate record is created in our database. The original resume document, whether in MS Word or PDF format, is then stored securely in our Content Management System (CMS) for convenient access.
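
As a rough illustration of the submission step, the sketch below creates a candidate record and stores a resume link. The `CandidateRecord` fields, the `upload_to_cms` stand-in, the example URL, and the in-memory `db` dictionary are all assumptions made for this example; they are not the job board's or BoltonSea's actual schema or API.

```python
from dataclasses import dataclass, field
from uuid import uuid4

ALLOWED_FORMATS = {"pdf", "doc", "docx"}  # MS Word or PDF, as described above


@dataclass
class CandidateRecord:
    """One applicant as stored in the database (hypothetical schema)."""
    name: str
    resume_url: str      # link to the original file held in the CMS
    resume_format: str   # lowercase file extension
    candidate_id: str = field(default_factory=lambda: uuid4().hex)


def upload_to_cms(filename: str, content: bytes) -> str:
    """Stand-in for the real CMS upload; returns a made-up URL."""
    return f"https://cms.example.invalid/resumes/{filename}"


def submit_application(name: str, filename: str, content: bytes, db: dict) -> CandidateRecord:
    """Validate the file type, store the file, and create the candidate record."""
    suffix = filename.rsplit(".", 1)[-1].lower()
    if suffix not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported resume format: {filename!r}")
    url = upload_to_cms(filename, content)
    record = CandidateRecord(name=name, resume_url=url, resume_format=suffix)
    db[record.candidate_id] = record
    return record
```

The returned `resume_url` is the value that would later be handed to the AI engine layer.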

Each candidate's resume link (resume URL) is then forwarded to our AI engine layer, which indexes the resume so the language model can answer questions about it. The script below calls this step "training", but it builds a searchable vector index from the resume rather than updating model weights. For efficiency, every resume is first converted to Markdown, and only the essential information is extracted and used.
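
The post does not show how the essential information is pulled out of the Markdown. One possible approach, sketched here under the assumption that resumes use Markdown headings containing words like "Experience", "Education", and "Skills", is to keep only the text under those headings. The function name and the keyword list are hypothetical.

```python
import re

WANTED = ("experience", "education", "skills")


def extract_sections(markdown: str, wanted: tuple = WANTED) -> dict:
    """Keep only the text under headings whose title contains a wanted keyword."""
    sections = {}
    current = None
    for line in markdown.splitlines():
        heading = re.match(r"^#{1,6}\s+(.*\S)\s*$", line)
        if heading:
            title = heading.group(1).lower()
            # Switch to the matching section, or to None for headings we skip
            current = next((w for w in wanted if w in title), None)
            if current is not None:
                sections.setdefault(current, [])
            continue
        if current is not None:
            sections[current].append(line)
    return {name: "\n".join(lines).strip() for name, lines in sections.items()}
```

Real resumes vary a lot in layout, so a heading-based filter like this would likely need fallbacks for documents without clear section titles.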

This streamlined system is designed to benefit the Hiring Manager by optimizing the job application review process and identifying only the most qualified candidates.

AI Training Script (Python Snippet)

Convert PDF to Markdown

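from typing import Iterator

from docling.document_converter import DocumentConverter
from langchain_core.document_loaders import BaseLoader
from langchain_core.documents import Document as LCDocument


class DoclingResumeLoader(BaseLoader):
    # Class wrapper and constructor are assumed; the original shows only the body below
    def __init__(self, file_path: str) -> None:
        self.file_path = file_path
        self.converter = DocumentConverter()

    def lazy_load(self) -> Iterator[LCDocument]:
        # Parse the resume at self.file_path with Docling
        docling_doc = self.converter.convert(self.file_path).document

        # 🔄 Convert to Markdown format
        text = docling_doc.export_to_markdown()

        # ✅ Conversion complete: record where the text came from
        metadata = {
            "source": self.file_path,
            "format": "resume",
        }
        yield LCDocument(page_content=text, metadata=metadata)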


Train AI Model


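import os
import time

from langchain.chains import ConversationalRetrievalChain
from langchain_community.vectorstores import FAISS
from langchain_core.prompts import PromptTemplate
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_openai import ChatOpenAI
from langchain_text_splitters import RecursiveCharacterTextSplitter


def train_resume(resume_url: str):
    # Placeholder: base URL of an OpenAI-compatible endpoint
    OPENAI_API_URL = "https://OPEN_URL"
    index_path = f"{resume_url}_faiss_index"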

    # 🔤 Initializing embedding model
    embeddings = HuggingFaceEmbeddings(
        model_name="sentence-transformers/all-MiniLM-L6-v2"
    )

    # ✅ Embedding model initialized
    if os.path.exists(index_path):
        print("📦 Loading existing vector store...")
        vectorstore = FAISS.load_local(
            index_path, embeddings, allow_dangerous_deserialization=True
        )
        # ✅ Vector store loaded
    else:
        # 💫 No existing index found. Creating new one...

        loader = DoclingResumeLoader(resume_url)
        documents = loader.load()

        print("\n📄 Splitting document into chunks...")
        text_splitter = RecursiveCharacterTextSplitter(
            chunk_size=1000, chunk_overlap=200, separators=["\n\n", "\n", " ", ""]
        )
        splits = text_splitter.split_documents(documents)
        print(f"✅ Created {len(splits)} chunks")

        print("\n📦 Building vector store and creating embeddings...")
        vectorstore_start = time.time()
        vectorstore = FAISS.from_documents(splits, embeddings)
        vectorstore_time = time.time() - vectorstore_start
        print(f"✅ Vector store built in {vectorstore_time:.2f} seconds")

        print(f"💾 Saving vector store to {index_path}")
        save_start = time.time()
        vectorstore.save_local(index_path)
        save_time = time.time() - save_start
        print(f"✅ Vector store saved in {save_time:.2f} seconds")

    retriever = vectorstore.as_retriever(search_type="mmr", search_kwargs={"k": 5})
    print("✅ Vector store ready")

    print("\n🤖 Connecting to local language model...")

    llm = ChatOpenAI(
        model="local-model",
        openai_api_base=OPENAI_API_URL,
        openai_api_key="not-needed",
        temperature=0,
    )

    template = """You are a helpful assistant answering questions about the resume: {book_name}.
    Use the following context to answer the question: {context}
    Question: {question}
    Answer the question accurately and concisely based on the context provided."""

    prompt = PromptTemplate(
        input_variables=["book_name", "context", "question"], template=template
    )

    # ✨ System ready: build and return the conversational retrieval chain
    return ConversationalRetrievalChain.from_llm(
        llm=llm,
        retriever=retriever,
        return_source_documents=True,
        combine_docs_chain_kwargs={
            "prompt": prompt,
            "document_variable_name": "context",
        },
    )
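
To make the `chunk_size=1000, chunk_overlap=200` settings more concrete, here is a deliberately simplified sliding-window splitter. It is not how `RecursiveCharacterTextSplitter` works internally (that class prefers to break on the listed separators); it only illustrates how consecutive chunks share their boundary text so that a sentence cut at a chunk edge still appears whole in one of them. The function name is hypothetical.

```python
def sliding_chunks(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> list:
    """Split text into fixed-size windows where neighbours share chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


if __name__ == "__main__":
    for chunk in sliding_chunks("abcdefghij", chunk_size=4, chunk_overlap=1):
        print(chunk)
```

With the script's defaults, each new chunk starts 800 characters after the previous one, so a typical one or two page resume yields only a handful of chunks.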


Ask A.I. "The Question"

Question: Who qualifies for this job with {{description}}
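
A minimal sketch of filling in that question from a job description follows. The helper name is hypothetical, and Python's `str.format` with a single-brace placeholder is used in place of the `{{description}}` notation above; any templating approach would do.

```python
QUESTION_TEMPLATE = "Who qualifies for this job with {description}"


def build_question(description: str) -> str:
    """Collapse whitespace in the job description and insert it into the question."""
    description = " ".join(description.split())
    if not description:
        raise ValueError("job description is empty")
    return QUESTION_TEMPLATE.format(description=description)
```

Because `train_resume` indexes a single resume, a question built this way is answered against one candidate at a time; comparing candidates would mean asking it once per resume.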

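import argparse
import os

# Argument parsing is not shown in the original; a single --url option is assumed
parser = argparse.ArgumentParser(description="Ask questions about one resume")
parser.add_argument("--url", required=True, help="path or URL of the resume to index")
args = parser.parse_args()


def print_result(result: dict) -> None:
    # Minimal stand-in for the helper the script calls (assumption)
    print(result["answer"])


trained_model = train_resume(args.url)
chat_history = []

print("\n Ready to answer questions \n")
print("Type 'quit!' to exit")

while True:
    question = input("\n Ask a question: ")
    if question.strip().lower() == "quit!":
        break

    print("\n ... Processing ... \n")
    result = trained_model.invoke(
        {
            "question": question,
            "chat_history": chat_history,
            # The prompt's {book_name} slot receives the resume's file name
            "book_name": os.path.basename(args.url),
        }
    )

    print_result(result)
    chat_history.append((question, result["answer"]))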

Tools & Dependencies

  • requirements.txt
  langchain-core==0.3.28
  langchain-text-splitters==0.3.4
  langchain_huggingface==0.1.2
  docling==2.14.0
  langchain_community==0.3.13
  langchain_openai==0.2.14
  langchain==0.3.13
  faiss-cpu==1.9.0

Installation: pip install -r requirements.txt
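
After installing, it can be useful to confirm that the environment actually matches the pins. The sketch below is one way to do that with the standard library; the function names are hypothetical, and it only understands simple `name==version` lines like the ones listed above.

```python
from importlib import metadata


def parse_pins(requirements_text: str) -> dict:
    """Return {package: version} for every 'name==version' line."""
    pins = {}
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def check_pins(pins: dict) -> dict:
    """Return {package: (pinned, installed)} for packages that differ or are missing."""
    mismatches = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    with open("requirements.txt", encoding="utf-8") as handle:
        for name, (wanted, installed) in check_pins(parse_pins(handle.read())).items():
            print(f"{name}: pinned {wanted}, installed {installed}")
```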

  • Job Portal (NextJS): Superio suits professional job board websites that need advanced features and useful services for users.

  • BoltonSea (CMS): BoltonSea is a content management system that offers file and media management and streaming services through an API and a dashboard UI.

  • Docling (Open Source Document Processing for AI): Docling simplifies document processing with advanced PDF understanding, OCR support, and seamless AI integrations. Parse PDFs, DOCX, PPTX, images & more
