An Improved Framework of Tree-Structured Data Mining for Business Process Logs Analysis
Izwan Nizal Mohd Shaharanee
Jastini Mohd Jamil
UNIVERSITI UTARA MALAYSIA
Introduction
Business process: an organized, measured arrangement of activities
Complex business processes require automated and systematic process management
Business Process Management (BPM) comprises the methods, techniques and tools that support the design, management and analysis of operational business processes
BPM is supported in e-mail applications, inventory software, accounting systems, etc.
Introduction
Business processes are important for process optimization and improvement
Examples: university databases, business workflows and many more
Knowledge from these processes can give great insight for better decision making
Intrusion detection, audit trails, crime surveillance and fraudulent transactions are several potential areas for business process mining
RELATED WORKS
Business processes leave traces and event logs
These give valuable information about the process captured
The process discovery task aims to discover the best workflow for a business process
Log data are stored in XML formats (MXML/XES)
Frequent subtree mining methods are the basis for discovering interesting associations among tree-structured data objects in XML data
RELATED WORKS
The utilization of subtree mining in process log data is less explored
A subtree would capture the workflow execution pattern over the business process
The subtree patterns also preserve the context of the events and event attributes within a trace
Problem Definitions
Large Number of Event Logs/Business Process Data:
• Not represented in a user-friendly format
• Large and complex
• Repetitive events
• Conventional storing approach
• Hidden knowledge
Problem Statement
Event logs and business process data are often very large in terms of the number of records, and they are usually represented in a format that is not user-friendly. Hence, sophisticated, robust methods and tools are needed to extract hidden knowledge from event data to optimize or improve business processes.
The Objective
In this research paper, a direct mining approach for XML-based process logs is investigated and an improved framework is proposed.
The RESEARCH FRAMEWORK
The PHASES
Pre-Processing
DSM Extraction
Knowledge Discovery
Interpretation
Phase I:
Pre-Processing
Raw Process Data
ETL (Extract, Transform, Load)
XES/MXML
Grouping, Discretization, Filtering, Transformation
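The pre-processing step can be illustrated with a minimal sketch, assuming a simplified XES-style log without XML namespaces. The sample log, the `attributes` and `load_traces` helpers and the `keep_keys` parameter are hypothetical names introduced for illustration only; they are not taken from the slides. The sketch groups events by trace and filters event attributes down to a chosen set of keys.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified XES-style log (real XES files usually carry
# namespaces, extensions and more attribute types).
SAMPLE_XES = """<log>
  <trace>
    <string key="concept:name" value="case-1"/>
    <event>
      <string key="concept:name" value="register"/>
      <string key="org:resource" value="alice"/>
    </event>
    <event>
      <string key="concept:name" value="approve"/>
      <string key="org:resource" value="bob"/>
    </event>
  </trace>
  <trace>
    <string key="concept:name" value="case-2"/>
    <event>
      <string key="concept:name" value="register"/>
      <string key="org:resource" value="alice"/>
    </event>
  </trace>
</log>"""


def attributes(element):
    """Collect the key/value attribute children of a trace or event."""
    return {child.get("key"): child.get("value")
            for child in element if child.tag != "event"}


def load_traces(xml_text, keep_keys=("concept:name",)):
    """Group events by trace and keep only the event attributes in keep_keys."""
    root = ET.fromstring(xml_text)
    traces = {}
    for trace in root.iter("trace"):
        case_id = attributes(trace).get("concept:name")
        events = []
        for event in trace.findall("event"):
            attrs = attributes(event)
            events.append({k: attrs[k] for k in keep_keys if k in attrs})
        traces[case_id] = events
    return traces


traces = load_traces(SAMPLE_XES)
```

Here `traces` maps each case identifier to its ordered list of filtered events; discretization of numeric attributes would be a further step and is not shown.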
Phase II:
DSM Extraction
DSM (Document Structure Model)
FDT (Flat Data Table)
DSM – transforming tree data into a flat representation
Extract the most general tree structure (one that every instance matches)
This structure becomes the first row
Remaining instances are placed based on label, backtrack and non-existence
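The flattening idea can be sketched as follows, under several simplifying assumptions that are not stated in the slides: each tree is a `(label, children)` tuple, instances are matched to the general structure by child position, and the cell values `"yes"` and `"No"` for backtrack and missing positions are illustrative conventions. The names `build_dsm` and `to_row` are hypothetical.

```python
MISSING = "No"  # assumed marker for a node that does not exist in an instance


def build_dsm(trees):
    """Build the most general structure that every (label, children) tree fits.

    The result is a nested list: each node is the list of its child nodes.
    Children are merged by position, which is a simplifying assumption.
    """
    width = max(len(children) for _, children in trees)
    return [build_dsm([t[1][i] for t in trees if i < len(t[1])])
            for i in range(width)]


def to_row(dsm, tree, counter=None):
    """Walk the DSM in pre-order and place one tree instance into a flat row.

    Node columns (X0, X1, ...) hold labels; backtrack columns (b0, b1, ...)
    record whether the corresponding child existed in this instance.
    """
    counter = counter if counter is not None else {"x": 0, "b": 0}
    label, children = tree if tree is not None else (MISSING, [])
    row = {f"X{counter['x']}": label}
    counter["x"] += 1
    for i, child_dsm in enumerate(dsm):
        child = children[i] if i < len(children) else None
        row.update(to_row(child_dsm, child, counter))
        row[f"b{counter['b']}"] = "yes" if child is not None else MISSING
        counter["b"] += 1
    return row


t1 = ("trace", [("register", []), ("approve", [])])
t2 = ("trace", [("register", [])])
dsm = build_dsm([t1, t2])
rows = [to_row(dsm, t) for t in (t1, t2)]
header = list(rows[0].keys())
```

The `header` list plays the role of the first row of the flat table, and each entry of `rows` is one instance placed under it.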
Phase III:
Knowledge Discovery
Knowledge Discovery based on FDT
Frequent Subtree Mining
Prediction
Clustering
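Once the data are in a flat table, classical pattern counting becomes applicable. The sketch below is a brute-force frequent itemset count, used here as a simple stand-in and not as the mining algorithm of the framework. The table contents, the `frequent_patterns` name and its parameters are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_patterns(rows, min_support, max_size=2, missing="No"):
    """Count (column, value) combinations that occur in at least min_support rows."""
    counts = Counter()
    for row in rows:
        # Each existing cell becomes one item; missing positions are skipped.
        items = sorted((col, val) for col, val in row.items() if val != missing)
        for size in range(1, max_size + 1):
            counts.update(combinations(items, size))
    return {pattern: n for pattern, n in counts.items() if n >= min_support}


# Hypothetical flat data table with node columns only.
fdt = [
    {"X0": "trace", "X1": "register", "X2": "approve"},
    {"X0": "trace", "X1": "register", "X2": "No"},
    {"X0": "trace", "X1": "register", "X2": "reject"},
]
patterns = frequent_patterns(fdt, min_support=2)
```

Because every item keeps its column name, each frequent pattern still records where in the tree structure its labels occurred.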
Phase IV:
Interpretation
Mapping the Knowledge Model to the DSM
Document Structure Model (DSM)
Regenerate Subtree
Interpretation
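Mapping a discovered pattern back onto the DSM can be sketched as below, assuming the DSM is available as a child-to-parent map over column names. The `parent_of` map, the example pattern and the `regenerate_subtree` function are hypothetical; the sketch attaches each pattern node to its nearest ancestor that also appears in the pattern.

```python
def regenerate_subtree(pattern, parent_of):
    """Turn (column, label) items into labelled edges using the DSM parent map."""
    labels = dict(pattern)
    edges = []
    for column in labels:
        ancestor = parent_of.get(column)
        # Climb the DSM until an ancestor that is part of the pattern is found.
        while ancestor is not None and ancestor not in labels:
            ancestor = parent_of.get(ancestor)
        if ancestor is not None:
            edges.append((f"{ancestor}={labels[ancestor]}",
                          f"{column}={labels[column]}"))
    return edges


# Hypothetical DSM: X1 and X3 are children of X0, and X2 is a child of X1.
parent_of = {"X0": None, "X1": "X0", "X2": "X1", "X3": "X0"}
pattern = (("X0", "trace"), ("X2", "register"), ("X3", "approve"))
edges = regenerate_subtree(pattern, parent_of)
```

The resulting edge list describes the regenerated subtree, which can then be read and interpreted in terms of the original log structure.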
The CONCLUSION
• How does the proposed framework work?
Business logs data (tree-structured format)
Extract their Document Structure Model (DSM)
Business logs data in flat data format
Mining the data
Obtain knowledge
Mapping the knowledge to the DSM
Interpret the knowledge
The CONCLUSION
• Business process logs exist in many areas
• The capability of representing process logs in XML format offers better data analysis
• In this paper, XML-based business process logs were represented in a flat data format
• Thus, many classical data mining techniques can be utilized
