FUZZY DEFINITION FOR TASKS AND RESOURCES CATEGORIZATION IN SOFTWARE PROJECTS

Abstract: Risk in the software development process is an event that, if allowed to occur, can cause a major hazard in the development process and, subsequently, deterioration in software quality. Risk reduction has been a major concern to software project stakeholders. A major way of preventing risks is to get the task and resource combination (allocation) right from the onset of software development. Otherwise, a major bottleneck could arise, one that can either halt the project temporarily, extending the delivery date, or cause a total stoppage of the project. To forestall such an occurrence, this work looked into task and resource combination for software projects by remodeling the risk identification procedure of an existing risk model, the Riskit model introduced by Kontio and Basili. A typical project process depicting the risk identification procedure of the original Riskit model was arranged; the risk identification stage was then redesigned and broken down into different parts, and the total RAG time (RAG_T) was generated in the process. The main task-resource fuzzification took place in Part C, and the project information update took place in Part D of the system. Pointer setting to risky tasks was estimated from the difficulty level of the task-hours (Xt) and the required resources (Xh), and categorized using the trapezoidal membership function in fuzzy logic. The resulting system, later named ERAM, was compared with the Brainstorming and Riskit Analysis Graph technique (BRAG) using runtime and the task segments performed by both models. Applying the task-resource combination, the results showed that the original model, BRAG, took 88 minutes, while ERAM generated a faster advisory output in 3 seconds, with the Xt-Xh coordinate W between 0.9 and 1.0. The study also revealed that time saving is not the only benefit when resource (Xh) and task (or task-hour, Xt) combination and categorization are properly done. The BRAG yearly runtime estimate was 237 hours at an overhead cost of ₦3,051,612, while that of ERAM was 2.6 hours at an overhead cost of ₦37,078. The study concluded that ERAM is adequate for speedy classification of software project constituents, early risk identification, and reduction in delivery time and cost. Therefore, it is recommended that software risk experts and project managers adopt ERAM to increase speed in risk identification, gain ample time for other project activities and reduce overhead cost.


I. INTRODUCTION
In the real world, many problems are too complicated to understand fully because of ambiguity or uncertainty in the quantitative data embedded in them. For instance, notions such as large profit, high temperature or a wealthy man can be difficult to represent precisely [2]. According to [9], both set theory and probability theory have served as veritable tools for solving some of these real-life challenges. Hitherto, concepts such as probability theory have been applied to situations where system or data behavior is based on random processes. However, fuzzy logic can be used for complex problems that probability or set theory has not yet solved. The works of [13], [21] and [25] established that fuzzy logic was initially developed by Lotfi A. Zadeh in 1965. Fuzzy logic is a form of mathematical logic whose theory is based on the notion of relative graded membership, as inspired by the processes of human perception and cognition. The four principal facets of fuzzy logic are shown in Figure 1.

Figure 1: Principal facets of fuzzy logic (recoverable labels: F, fuzziness/fuzzification; G, granularity/granulation). Source: [25]

In this work, fuzzy logic is applied to tasks and resources categorization in software projects. For a better understanding, we first define the operational terms and then look at some basic concepts and earlier works.

A. Definition of operational terms
Task: Tasks are simply activities carried out during project development.
Project resource: These are the assets required to carry out the different tasks or activities of the project. Examples are personnel, equipment, funding and time.
Quality: Quality has to do with "the software's behavior while it is executing, the structure and organization of the system programs and associated documentation" [22].
Risk impact: This is the magnitude or degree of the potential loss to the organization or project team if a risk occurs [14].
Schedule: A plan or document that defines what must be done, the order of execution and for how long [24].
BRAG: Abbreviation of the Brainstorming and RAG (Riskit Analysis Graph) process.

II. LITERATURE REVIEW
Prediction and Classification Tools
This section provides an analysis of some prediction and classification tools used in computer science, software engineering and information technology projects. Although these classification tools are all strong and important, only the Bayesian, neural network and fuzzy approaches are briefly described here (since they are the most used), followed by an analysis of related works in which these algorithms have been used. Moreover, for the purpose of this work, more attention is given to fuzzy logic.

A. The Bayesian Probability Theory and Network Description
Bayesian probability (also known as Bayes' theorem) was proposed by Thomas Bayes in the 18th century ([11]; [12]). Deductions from [9] reveal that the Bayesian Network, also called BN (or causal probabilistic model), was developed after the initial Bayes' theorem, and that BNs and their algorithms (plus the associated tools) are being applied in a range of applications that were previously deemed impossible using Bayes' theorem alone. Bayesian probability involves decision making through accurate prediction of events. A formal definition of the model is presented below. Definition: "Let A1, A2, ..., Ak be events that partition the sample space Ω (that is, Ω = A1 ∪ A2 ∪ ... ∪ Ak and Ai ∩ Aj = ∅ when i ≠ j) and let B be an event on that space for which Pr(B) > 0". Then Bayes' theorem is:

$$\Pr(A_i \mid B) = \frac{\Pr(B \mid A_i)\,\Pr(A_i)}{\sum_{j=1}^{k} \Pr(B \mid A_j)\,\Pr(A_j)}, \qquad i = 1, \dots, k$$
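As an illustration of Bayes' theorem over a partition, the following minimal Python sketch computes the posterior probabilities Pr(Ai | B). The function name `bayes_posterior`, the event descriptions and the numbers are hypothetical and are not taken from the study.

```python
def bayes_posterior(priors, likelihoods):
    """Return Pr(A_i | B) for every event A_i of a partition.

    priors[i]      = Pr(A_i); the values are assumed to sum to 1.
    likelihoods[i] = Pr(B | A_i).
    """
    joint = [p * l for p, l in zip(priors, likelihoods)]
    evidence = sum(joint)  # Pr(B), by the law of total probability
    if evidence <= 0:
        raise ValueError("Pr(B) must be greater than 0")
    return [j / evidence for j in joint]


# Hypothetical events: A1 = "task is under-resourced",
# A2 = "task is adequately resourced", B = "task misses its deadline".
posterior = bayes_posterior(priors=[0.5, 0.5], likelihoods=[0.8, 0.2])
print(posterior)
```

The denominator is Pr(B) expanded over the partition, which is why the definition requires Pr(B) > 0.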

B. The Neural Network (NN)
Neural networks (otherwise called NN or Artificial Neural Networks) can be pictured "in the means of a directed graph known as network graph"; a neural network basically comprises a set of interconnected processing units [23]. The NN is "an exceptionally simplified model of the brain". In other words, these are computers or computer functions whose structural design is modeled after the brain. Just like the brain, the NN tends to simulate "qualitative reasoning"; each unit or node in the NN is a simplified model of a real neuron. A neuron fires (that is, sends a new signal) when it receives a strong enough input signal from the other nodes to which it is connected [4]. Based on [15] and some of the initial works of [10], a neural net does not need more than two hidden layers to solve most problems.

C. The Fuzzy Logic Formal Definition
A fuzzy set is a "collection of objects that might belong to the set to a degree, varying from 1 for full belongingness to 0 for full non-belongingness, through all intermediate values" [2]. The main aim of fuzzy logic (FL) is "to aid the formalization of modes of reasoning which are approximate rather than exact" [25]. In fuzzy logic, values between 0 and 1 represent uncertainty in decision making, with 0 being the false value and 1 the true value. Thus, in a fuzzy set, a value x is not restricted to the values 0 or 1 but is drawn from the real interval [0, 1].

Figure 3: Structure of a Neuro-Fuzzy System. Source: [7]

D. Basic Elements of Fuzzy
There are three basic elements of fuzzy logic ([2]; [25]). These are:
I. Linguistic variables: These are variables whose values are words or sentences in a natural or artificial language. For example, if wide, very wide and very very wide are values of DISTANCE, then DISTANCE is a linguistic variable. We may also have words called modifiers; a modifier is "an operation that modifies the meaning of terms". For instance, in the expression "the water is extremely hot", the word extremely modifies hot, which is a fuzzy set.
II. Fuzzy rules (fuzzy conditional statements): Fuzzy rules are expressions of the form IF A THEN B. As presented by Lotfi A. Zadeh, the father of fuzzy logic, in his 2004 work "Fuzzy Logic Systems: Origin, Concepts, and Trends", a fuzzy rule can take any of these forms: simple, compound, dynamic, command and dispositional. A rule can also be a dependency or a command. An example of a dependency is "B is large if A is small"; an example of a command is "Reduce Y slightly if X is small. Reduce Y substantially if X is not small".
III. Fuzzy sets (or algorithms): This is an ordered sequence of instructions which may contain fuzzy assignment and conditional statements.
As presented by [2], these basic elements are the main contrast between fuzzy systems and the conventional (crisp) system depicted in Figure 2.
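To make the idea of a modifier concrete, the sketch below applies the modifier "very" to a fuzzy set "hot". Both the ramp-shaped membership function and the squaring convention for "very" (often called concentration in the fuzzy logic literature) are assumptions made for illustration; the source does not specify either.

```python
def hot(temp_c):
    """Hypothetical membership of a temperature (in degrees C) in the fuzzy set 'hot'.

    Assumed shape: 0 below 30 C, rising linearly to 1 at 80 C.
    """
    return min(1.0, max(0.0, (temp_c - 30.0) / 50.0))


def very(mu):
    """Modifier 'very', modeled here by squaring the membership degree (an assumption)."""
    return mu ** 2


mu_hot = hot(55.0)          # degree to which 55 C is 'hot'
mu_very_hot = very(mu_hot)  # degree to which 55 C is 'very hot'
print(mu_hot, mu_very_hot)
```

Under this convention, a modified term never has a higher membership degree than the term it modifies, which matches the intuition that "very hot" is a stricter notion than "hot".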

E. Fuzzy Operators
For easy manipulation of fuzzy sets, it is acceptable to redefine the operators of classical set theory so that they fit the membership functions of fuzzy logic for values strictly between 0 and 1.
Normally, the operators on fuzzy sets are chosen, just like the membership functions. This is depicted by [7] in Table 1. Aside from the usual properties such as "commutativity, distributivity and associativity", the two notable expressions associated with fuzzy operators are:
a. "In fuzzy logic, the law of excluded middle is contradicted: A ∪ Ā ≠ X, i.e. µ_{A∪Ā}(x) ≠ 1.
b. In fuzzy logic, an element can belong to A and not A at the same time: A ∩ Ā ≠ ∅, i.e. µ_{A∩Ā}(x) ≠ 0. Note that these elements correspond to the set supp(A) − not(A)".
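The two expressions above can be checked numerically. The sketch below assumes the Zadeh operators (minimum for intersection, maximum for union, 1 − µ for complement); Table 1 is not reproduced in this text, so this choice of operators is an assumption rather than the authors' stated selection.

```python
def fuzzy_not(mu):
    """Complement: membership in 'not A'."""
    return 1.0 - mu


def fuzzy_and(mu_a, mu_b):
    """Intersection, using the minimum operator."""
    return min(mu_a, mu_b)


def fuzzy_or(mu_a, mu_b):
    """Union, using the maximum operator."""
    return max(mu_a, mu_b)


mu_a = 0.25  # hypothetical membership degree of x in A

union_with_complement = fuzzy_or(mu_a, fuzzy_not(mu_a))          # not equal to 1
intersection_with_complement = fuzzy_and(mu_a, fuzzy_not(mu_a))  # not equal to 0
print(union_with_complement, intersection_with_complement)
```

For crisp membership values (0 or 1) the same functions reduce to the classical set operators, so both classical laws hold again.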

F. Reasoning in Fuzzy Logic
Fuzzy reasoning (also called approximate reasoning) works on fuzzy rules expressed using linguistic variables or natural language [7]. A fuzzy rule is of the form: "If x ∈ A and y ∈ B then z ∈ C, with A, B and C fuzzy sets". For example: 'if (the intensity of the sun is high), then (weather is hot)'. The variable "weather belongs to the fuzzy set hot to a degree that depends on the degree of validity of the premise". To determine the degree of truth of the fuzzy proposition 'weather is hot', we have to define the fuzzy implication.
Just as with the fuzzy operators, the fuzzy designer must choose among the range of fuzzy implications already defined or define one manually. The most commonly used fuzzy implications are listed in Table II.

Table II: Fuzzy Implication (columns: Name, Truth Value)

There are other definitions of fuzzy implication, but they are not as popular as the ones in Table II. Based on Zadeh's theory and the work of [7], the result of applying a fuzzy rule is subject to three factors:
1. The chosen definition of fuzzy implication.
2. The definition of the membership function of the fuzzy set of the proposition located at the conclusion of the fuzzy rule.
3. The degree of validity of the propositions located in the premise.
Then, based on the defined fuzzy operators (AND, OR and NOT), the premise of a fuzzy rule can be formed from a combination of fuzzy propositions. The combination of all the rules of a fuzzy system is known as the decision matrix.
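The role of the implication can be sketched in a few lines. The entries of Table II are not recoverable from this text, so the example assumes two implications that are frequently cited in the fuzzy control literature, Mamdani (minimum) and Larsen (product); the numeric degrees are hypothetical.

```python
def mamdani(premise_degree, conclusion_degree):
    """Mamdani implication: truth value is the minimum of the two degrees."""
    return min(premise_degree, conclusion_degree)


def larsen(premise_degree, conclusion_degree):
    """Larsen implication: truth value is the product of the two degrees."""
    return premise_degree * conclusion_degree


# Hypothetical degrees: 'the intensity of the sun is high' holds to degree 0.5,
# and a candidate temperature belongs to the fuzzy set 'hot' to degree 0.75.
premise = 0.5
conclusion = 0.75
print(mamdani(premise, conclusion), larsen(premise, conclusion))
```

Both functions depend on the three factors listed above: which implication is chosen, the membership degree at the conclusion, and the degree of validity of the premise.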

G. Membership functions (MF) and Defuzzification
I. Membership functions (MF)
To clearly categorize parameters, several functions called membership functions can be used. These include the Gaussian, triangular, trapezoidal, generalized bell, π-shaped and S-shaped membership functions [3]. Different people may define or formulate their own membership functions for a fuzzy set owing to diverse levels of familiarity, information and experience. For instance, there may be the L or r, Gaussian or gamma [19]. There are no strict rules on the use of MFs: other specialized MFs can be created for specific applications, and any type of continuous probability distribution function can be used as an MF provided that a set of parameters is given to specify the appropriate meaning of the MF. However, since the simplest membership functions are formed using straight lines, the trapezoidal function is chosen for the categorization of the parameters and necessary factors in the development process of the proposed design in the later part of this work.
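For comparison with the straight-line functions mentioned above, the sketch below implements a triangular and a Gaussian membership function using their usual textbook definitions. The parameter names and sample values are illustrative assumptions, not values from the study.

```python
import math


def triangular(x, a, b, c):
    """Triangular MF with feet at a and c and peak at b (assumes a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)  # rising straight line
    return (c - x) / (c - b)      # falling straight line


def gaussian(x, mean, sigma):
    """Gaussian MF centred on mean with spread sigma (assumes sigma > 0)."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2)


for x in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(x, triangular(x, 0.0, 0.5, 1.0), round(gaussian(x, 0.5, 0.2), 3))
```

The triangular function needs only comparisons and one division, which illustrates why straight-line functions are regarded as the simplest to compute and to explain.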

II. Defuzzification
Defuzzification is the process of approximating the value of the dependent variable based on the resultant fuzzy set after implementing the fuzzy inference rule [19]. Ref [18] further states that three defuzzification methods exist. These methods are:
I. Average method: The average numerical value of the dependent variable in the output fuzzy set.
II. Average of maximum method: The average numerical value of the dependent variable with the maximum degree of truth in the output fuzzy set.
III. Centroid method: The weighted average numerical value of the dependent variable in the output fuzzy set. The weight is the degree of truth.
Later in this work, the average method is applied to defuzzify the output of the membership function.

H. Classic Applications of Fuzzy Logic as presented by [7]; [20]
i. Fuzzy logic is extremely useful for many people involved in research and development in fields such as medicine and psychology.
ii. In transportation, organizations like Honda and Toyota have increased the throughput of their engines over the years using fuzzy logic concepts.
iii. Fuzzy logic has been applied in law and economics [25].
In real-life contexts an event 'is a fuzzy rather than a sharply defined collection of points' [2]. Hence, using the concept of a fuzzy set, 'the notions of an event and its probability can be extended in a natural fashion to fuzzy events'. For instance, consider cases like "it is a sunny day" or "my laptop is around 10 kg in weight". Lastly, a fuzzy system is powerful on its own, but it can also be combined with other algorithms analysed above, such as neural networks, to form a neuro-fuzzy system (shown in Figure 3) and applied as a better tool in decision making or solving complex problems.

I. Related Works
According to [8], data gathering and labeling is a task that can be costly in terms of time, human input and other resources. Hence, the author proposed a model that helps to tackle classification tasks where training data is scarce. To reach this goal, matching probabilities were coupled with error enhancement to focus on ideas which are essential for the immediate task. The proposed model adopts a learning method that focuses on the specific elements of the input relevant to the task. Methods such as a linear model, a single-layer MLP and NBOW were trained in order to have a good comparison with the developed model. This is excellent work; however, its attention is not on how exact resources can be allocated to a project's or organization's tasks. It is more of a representation learning methodology for performing classification in the low-data regime.
Ref [6] chronicled the varieties of resources available in IT and project management. About seven categories were identified, namely services, labor, equipment, materials, money, space and time. However, that work only explored the usage of these resources without linking the labor resources, the task and the time.
[20] did a comprehensive work on "Human resources classification and categorization", in which the author made a very clear distinction between human resources and other forms of resources such as natural and capital resources, living and non-living resources (as categorized by biologists), as well as spatial and temporal resources. The author further stressed that, of all the classifications, human resources are the most valued by an organization. This is one of the most detailed works on human resources and their functions in projects and organizations. However, it is limited in the area of the tasks assigned to the human resource and the duration of those tasks.
[15], on "Modelling Complex Resource Requirements in Business Process Management Systems", identifies business process management as a primary concern for senior executives. It goes on to explain other concerns, such as who carries out planned activities, and to provide a basic view of the resources engaged in business operations. The work stressed that most previous work was dedicated to demonstrating control-flow dependencies amongst activities, and that in widely used process notations such as BPMN and EPC, multifaceted connections between activities and resources are overlooked. The work gave a good analysis of workflow resources: it presented a rich data model for workflow resources that could be supported in a BPMS and subsequently developed a resource classification object role model. However, the various models developed were based on the authors' personal experiences. The work viewed resources only as human and non-human; no further breakdown was presented. Again, no categorization was done for tasks or activities, nor was any consideration given to an empirical combination of the two vis-à-vis other resources such as time and cost during business or project execution.
[18] attempted to add to the existing debate on the depiction of resources as major constituents of the business model idea. As emphasized by this author, the allocation of resources is core to the strategic management of a company because of the causal uncertainty of decisions and actions. The author further stressed that the process of allocating resources is multi-levelled and distributed in both the project and the organization, which subsequently results in decisions that directly affect the long-term direction, the characteristics and the affluence of an organization. In a diversified project environment, distributed decision-making on resource allocation is then an act of harmonizing amongst various targets, and it comprises continuous improvement in resource application.
The author was able to establish the importance of managing resources and their allocation in both business and project environments. Using existing theoretical structures, a categorization of resources was presented, but not much was said about allocation. Although the research was limited by the methods used for the study and some other factors, going by the author's report and end result, it is a successful work. However, the issue of task-resource combination was still left out. A major gap and limitation identified in the works analyzed above is in the area of task-resource combination and allocation. Hence, this is explored and improved upon in the model presented in this work.

III. TASKS AND RESOURCES FUZZIFICATION
In the introductory part of this work, a task is defined as an activity carried out during project development, while resources are used to carry out tasks. For good synchronization (or alignment) with the system being developed, personnel are considered a good choice of resource, since personnel seem to have direct links to tasks. Personnel (for example, the project manager) need time to complete a task or an activity using equipment (e.g. a computer). In this work, some categories of tasks (in relation to the resources needed) were identified, together with how they may contribute to the timely delivery of the software project result. These tasks and the different conditions are:
i. Tasks or activities that require more than one resource to complete. If a task requires more than one resource to complete, it is likely to delay another task waiting to use the same resource or one of the resources, thereby creating a deadlock; hence, it will require a different algorithm to solve.
ii. Tasks or activities that may take too long to complete. These tasks are not necessarily difficult. They may be delayed while waiting in line to be allocated resources, or they may have too many processes to undergo before completion.
iii. Lastly, tasks or activities that are presumed difficult.
In line with the aforementioned, for the sole purpose of the proposed system, four major categories of tasks have been labelled, as shown in Figure 4.

A. Fuzzification Process Using Trapezoidal Membership Function
To adequately place or categorize our tasks and subtasks as either extremely difficult, difficult, normal, easy or very easy, we use the process called fuzzy inference (see (2)). The trapezoidal curve is a function of a vector x and depends on four scalar parameters a, b, c and d, as given by:

$$f(x; a, b, c, d) = \begin{cases} 0, & x \le a \\ \dfrac{x-a}{b-a}, & a \le x \le b \\ 1, & b \le x \le c \\ \dfrac{d-x}{d-c}, & c \le x \le d \\ 0, & x \ge d \end{cases} \qquad (2)$$

For each of the determinant variables for task easiness or difficulty (that is, resources such as human/personnel, time and cost), we defined the following:
a) Human resources are rated as unskilled, skilled or moderate.
b) Time, or task duration, is rated as long, moderate or short.
c) For this assignment, cost is assumed trivial because cost is determined by the resource category and the time spent on the task.
d) At the end of the computation, each output variable (denoted by Wi) has four values, namely very difficult, difficult, easy and very easy.
We also need to note that a regular trapezoid has two side slopes (as shown in Figure 5):
1) The left slope, which is given by:

$$\frac{x-a}{b-a}, \qquad a \le x \le b$$

2) The right slope, which is given by:

$$\frac{d-x}{d-c}, \qquad c \le x \le d$$

Next, we define the fuzzy sets to be used in Table IV.
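A minimal Python sketch of the trapezoidal membership function is given below, written piece by piece so that the left slope, the plateau and the right slope are visible. The function name `trapezoidal` and the sample parameters are illustrative assumptions; the fuzzy sets of Table IV are not reproduced here.

```python
def trapezoidal(x, a, b, c, d):
    """Trapezoidal membership function with parameters a <= b <= c <= d.

    Returns 0 outside [a, d], 1 on the plateau [b, c], and a straight-line
    value on the two side slopes.
    """
    if b <= x <= c:
        return 1.0                # plateau
    if x <= a or x >= d:
        return 0.0                # outside the trapezoid
    if x < b:
        return (x - a) / (b - a)  # left slope
    return (d - x) / (d - c)      # right slope


# Hypothetical fuzzy set on the interval [0, 1].
params = (0.0, 0.25, 0.75, 1.0)
for x in (-1.0, 0.125, 0.5, 0.875, 1.0):
    print(x, trapezoidal(x, *params))
```

Checking the plateau first means a "shoulder" set with a = b or c = d is handled without a division by zero.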

B. Inference rules
Based on the assumptions about the various tasks stated above, the following basic rules can be generated:
a) If (a skilled resource is available) and (the task is easy), then risk is low.
b) If (the resource is unavailable) and (the task is easy), then the project is still at risk due to delay.
c) If (the resource is available) and (the task is complex or not easy), then risk is high.
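The three rules above can be sketched as a small rule evaluator. The example assumes the minimum operator for AND and 1 − µ for negation ("unavailable", "not easy"); the function name and input degrees are hypothetical, and the sketch encodes only the three rules stated in the text.

```python
def evaluate_rules(available, easy):
    """Return the firing strength of each rule for the given membership degrees.

    available: degree to which a (skilled) resource is available, in [0, 1].
    easy:      degree to which the task is easy, in [0, 1].
    """
    unavailable = 1.0 - available
    not_easy = 1.0 - easy
    return {
        "low risk": min(available, easy),             # rule a
        "risk due to delay": min(unavailable, easy),  # rule b
        "high risk": min(available, not_easy),        # rule c
    }


# Hypothetical case: the resource is largely available but the task is hard.
strengths = evaluate_rules(available=0.75, easy=0.25)
print(strengths)
```

In this hypothetical case rule c fires most strongly, so the task would be flagged as high risk even though the resource is available.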

Output
The output is Wi, a coordinate of task (or task-hour, Xt) and resource (Xh) that signifies how difficult or easy tasks are. Wi is computed as follows:

$$W_i = \min\left[F_N(X_t),\; F_N(X_h)\right] \qquad (4)$$

where Wi is the output function. From the fuzzy results, three levels of equations are derived in relation to the risk ranking x, one membership function for each of the following fuzzy sets:
i. "High risk";
ii. "Moderately high" risk;
iii. "Very high" risk (7).
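Equation (4) can be sketched as follows, assuming that F_N is the trapezoidal membership function used in the fuzzification process. The parameter tuples and input values are hypothetical, since the fuzzy sets actually used by ERAM are defined in tables not reproduced here.

```python
def trapezoid(x, a, b, c, d):
    """Compact form of the trapezoidal MF; assumes a < b <= c < d."""
    return max(0.0, min((x - a) / (b - a), 1.0, (d - x) / (d - c)))


def w_output(x_t, x_h, params_t, params_h):
    """W_i = min[F_N(X_t), F_N(X_h)], as in equation (4)."""
    return min(trapezoid(x_t, *params_t), trapezoid(x_h, *params_h))


# Hypothetical fuzzy sets for task-hours (X_t) and human resource (X_h).
params_t = (0.0, 0.25, 0.75, 1.0)
params_h = (0.0, 0.25, 0.75, 1.0)

w = w_output(x_t=0.125, x_h=0.5, params_t=params_t, params_h=params_h)
print(w)
```

Taking the minimum means that Wi is limited by whichever of the two inputs has the lower membership degree.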

C. Defuzzification Method
As explained earlier, three methods are available for the defuzzification of the outputs: the average method, the average of maximum method and the centroid method. The suitability of each method depends on the situation or application area. In this work, the average method is employed for the defuzzification of the output of the membership function, simply because it is straightforward and fits well with the trapezoidal membership function used in the fuzzification process.

1. Automation of the different tasks and activities involved in the whole development.
2. It helps to determine if a task requires more than one resource (and/or more time) to complete.
3. Determination of how sensitive tasks and activities are and their proximity to danger.
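The three defuzzification methods named in Section C can be compared on a small sampled output fuzzy set. The sketch below is one possible reading of the definitions given in Section II: in particular, the "average method" is interpreted here as the plain mean of the sampled values with non-zero membership, which is an assumption. The sample values are hypothetical.

```python
def average(xs, mus):
    """Average method: plain mean of the values that belong to the output set."""
    support = [x for x, mu in zip(xs, mus) if mu > 0]
    if not support:
        raise ValueError("output fuzzy set is empty")
    return sum(support) / len(support)


def average_of_maximum(xs, mus):
    """Average of the values that have the maximum degree of truth."""
    peak = max(mus)
    best = [x for x, mu in zip(xs, mus) if mu == peak]
    return sum(best) / len(best)


def centroid(xs, mus):
    """Weighted average of the values, weighted by degree of truth."""
    total = sum(mus)
    if total <= 0:
        raise ValueError("output fuzzy set is empty")
    return sum(x * mu for x, mu in zip(xs, mus)) / total


# Hypothetical sampled output fuzzy set.
xs = [0.0, 0.25, 0.5, 0.75, 1.0]
mus = [0.0, 0.5, 1.0, 1.0, 0.0]
print(average(xs, mus), average_of_maximum(xs, mus), centroid(xs, mus))
```

On a symmetric output set the three methods coincide; they differ only when the set is skewed, as in this sample.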

E. The Knowledge Base (KB) of the ERAM
The proposed system, named ERAM, is designed to help predict risks in fresh projects. Some knowledge of past software projects is used together with the fresh input to make adequate decisions on risks. This knowledge is kept in the knowledge base shown in Figure 7.

Outcome: Task-Resource fuzzification
Task-resource categorization is reached through the combination of human resource (Xh) and task (Xt), as shown in Table 7. Each case is computed vertically, with a combination of Xt and Xh. Since there are no strict rules on the use of the trapezoidal function, it is modified for use with appropriate meanings given to the specified parameters.
Fuzzy concepts consider numbers between 0 and 1. The inputs are chosen from the x-axis of the trapezium (in the range of -1 to 1) for the purpose of achieving our objectives. Since we are using the trapezoidal membership function, we substitute these values into its equation. Tables 8a-c show the computation at the subtask level of the ERAM.

F. INTERPRETATION OF RESULTS: "W Outputs"
As defined at the beginning of the fuzzy set analysis, the output variable has four ratings for the tasks: very difficult, difficult, easy and very easy. The outputs are interpreted as follows.
Output 0.25: For every task with Wi = 0.25, the task is very easy and probably requests just one resource to complete.
Output 0: This is the extreme case of task easiness. The task has all the resources at its disposal; probably no other task or activity in the development process needs them. Another possibility is that the project manager has set a lag for such tasks and activities so that they are not in any way in a competitive mode.
Output 0.5: Tasks with output 0.5 are still easy. They may request more than one resource, but the resources are readily available, so the task may not wait too long to complete.
Output 0.75: For all tasks with output Wi = 0.75, there is an inclination towards being considerably difficult; hence, the task may take a little longer to complete.
Outputs 0.9-1.0: Output Wi = 1.0 is the negative extreme case of task operation. According to our earlier definition, this output means the task is very difficult and, consequently, may take longer to complete. When this happens, either the resources needed to complete the task are being used for another task or the task operation has entered an indefinite loop. It also implies that, if the output of this task is needed as an input to another task to complete an operation, then the project is at risk: it may not complete on time, or it may halt indefinitely.
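The interpretation above can be expressed as a simple lookup. In the sketch below, the labels follow the text, while the treatment of intermediate values (anything from 0.9 upwards is "very difficult", otherwise the nearest listed level is used) is an assumption made for illustration.

```python
# Output levels discussed in the text and their interpretation.
LEVELS = [
    (0.0, "extremely easy"),
    (0.25, "very easy"),
    (0.5, "easy"),
    (0.75, "difficult"),
    (1.0, "very difficult"),
]


def interpret(w):
    """Map an output W_i in [0, 1] to a task difficulty label."""
    if not 0.0 <= w <= 1.0:
        raise ValueError("W must lie between 0 and 1")
    if w >= 0.9:
        return "very difficult"  # the 0.9-1.0 band flagged as risky
    # Otherwise fall back to the nearest listed level (an assumption).
    _, label = min(LEVELS, key=lambda pair: abs(pair[0] - w))
    return label


for w in (0.0, 0.25, 0.5, 0.75, 0.95, 1.0):
    print(w, interpret(w))
```

A task labelled "very difficult" is the case the text associates with resource contention or an indefinite loop, so it is the natural point at which to raise an advisory.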

IV. CONCLUSION
This study has revealed that time saving is not the only benefit when resource (Xh) and task (or task-hour, Xt) combination and categorization are properly done. ERAM supports the timely discovery of risk items (or activities) emanating from task-hours and resources, improving delivery time (shown in Table X), promoting the organization's image and subsequently increasing work demand and profit.
Having considered the risk identification procedure of the generic model and the general overhead of the processes involved, the developed ERAM is a supportive tool for risk identification and classification. Its "two-level sieve", the conditional probability and the fuzzy logic used as a "task-resource" classification tool, makes it difficult for any threat to convert to a risk without being captured.

V. ACKNOWLEDGMENT
We acknowledge the initial work of Kontio and Basili on the Riskit model, which provided a background for this study.