Estimation of reservoir rock porosity using linear ensemble combination of single artificial neural networks based on analytical and genetic algorithm techniques



Porosity is one of the most important properties in comprehensive studies of hydrocarbon reservoirs. The porosity of a rock, defined as the ratio of the volume of voids to the total volume of the rock, is conventionally determined in one of two ways. In the first, porosity is measured directly by testing drill cores. In the second, porosity is determined indirectly from well logging data using the relevant mathematical relations. Both methods have limitations and practical difficulties. Artificial neural networks (ANNs) can reduce these difficulties considerably while still yielding acceptable results. Solving a problem with ANNs involves three steps: training, generalization and operation. In the training step, the network learns the patterns that exist in the inputs and the relation between the inputs and the outputs of the training set. Generalization is the ability of the network to give acceptable responses for inputs that were not included in the training set (unseen patterns). Operation is the use of the network on the target problem. The network used in the operation step must therefore be well trained and must generalize well. One difficulty that may arise after training is overfitting, which amounts to poor generalization performance. When a network is trained until the error on the training patterns falls to a desired level, or for a fixed number of epochs, and overfitting does not occur, the training is called overtraining. In the ANN approach, a number of networks are trained and evaluated using a suitable performance criterion, for example the mean squared error (MSE), and the best network is selected on the basis of this criterion.
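The selection step described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's implementation: the function names `mse` and `select_best`, the network labels and the porosity values are all hypothetical.

```python
def mse(predicted, target):
    """Mean squared error between two equal-length sequences."""
    if len(predicted) != len(target):
        raise ValueError("sequences must have the same length")
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


def select_best(outputs_by_network, target):
    """Return (name, mse) of the network with the smallest MSE."""
    scores = {name: mse(out, target) for name, out in outputs_by_network.items()}
    best = min(scores, key=scores.get)
    return best, scores[best]


# Hypothetical porosity values (fractions), for illustration only.
target = [0.10, 0.12, 0.08, 0.15]
outputs = {
    "net_a": [0.11, 0.13, 0.07, 0.14],
    "net_b": [0.10, 0.15, 0.08, 0.12],
}
best_name, best_mse = select_best(outputs, target)
```

With these made-up numbers, `net_a` has the smaller MSE and is the network selected.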
Although selecting the best single neural network yields the best individual model obtained, it discards the information contained in the other networks. In addition, the generalization performance of the selected network on unseen patterns is limited, and estimation errors are common. If we accept that complete (100%) generalization over all possible test patterns is impossible, there is good reason to look for methods that improve the performance of ANNs. For this purpose, combining trained networks has been proposed, because a combination may integrate the information held by its component networks and thereby improve both the accuracy of the results and the generalization performance relative to the best single network. Combining single neural networks produces multiple-network systems, also called committee machines (CMs), which aim at better results for problems that a single network cannot solve or that a CM can solve more effectively. An ensemble combination of ANNs is a type of CM with a parallel structure in which each component network independently provides a solution to the target problem, and the individual solutions are then combined in a suitable manner. In function estimation problems, ensemble combinations can be linear or nonlinear. In this research, linear ensemble combinations of single artificial neural networks were applied to estimate the effective porosity of the Kangan gas reservoir rock in the giant Southern Pars hydrocarbon field. From the viewpoint of structural geology, the Kangan gas deposit is an asymmetric anticline with a northwest-southeast trend whose southeast side is turned. The formation consists of dolomite, limestone, dolomitic lime and thin layers of shale.
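In symbols, a linear ensemble of p component networks can be written as below. The notation is introduced here for illustration and is not taken from the source: y_j(x) is the output of the j-th network for input pattern x, alpha_j is its combination weight, and alpha_0 is an optional constant term.

```latex
\hat{y}(\mathbf{x}) = \alpha_0 + \sum_{j=1}^{p} \alpha_j \, y_j(\mathbf{x})
```

Setting alpha_0 = 0 and alpha_j = 1/p for every j gives the plain average of the component outputs; other choices of the weights give other linear combinations.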
Well logging data acquired from 4 wells in the area, over the depth interval corresponding to the Kangan formation, were used. A total of 215 selected patterns from wells SP1, SP3 and SP13 were used to train the networks, and 89 selected patterns from well SP6 were used to test their generalization performance. In each pattern, the acoustic, density, gamma ray and neutron porosity well log data served as the network inputs and the effective porosity data as the network outputs. First, back-propagation single neural networks with different structures (90 structures in total) were trained using the overtraining method. Then the 7 networks with the best results, i.e. the lowest MSE in the test step, were selected for building ensemble combinations. From these 7 networks, 120 linear ensemble combinations (21 two-fold, 35 three-fold, 35 four-fold, 21 five-fold, 7 six-fold and 1 seven-fold) were constructed using analytical methods: simple averaging and four of Hashem's optimal linear combination (OLC) methods, namely unconstrained MSE-OLC with a constant term, constrained MSE-OLC with a constant term, unconstrained MSE-OLC without a constant term and constrained MSE-OLC without a constant term. In Hashem's methods, the network coefficients of the MSE-OLC are computed through a set of matrix operations. Next, from each of the two-fold, three-fold, four-fold, five-fold, six-fold and seven-fold sets, the best combination produced by the 5 analytical methods, i.e. the one with the lowest MSE in the test step, was selected. For these 6 selected combinations, the MSE-OLC coefficients were also computed using a genetic algorithm (GA), in addition to the analytical methods.
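The counting of candidate ensembles and one of the analytical variants can be sketched as follows. The sketch assumes that the unconstrained MSE-OLC without a constant term amounts to an ordinary least-squares fit of the target on the component outputs, solved here through the normal equations; that reading, the helper names (`solve`, `olc_weights`, `combine`) and the data are assumptions made for illustration, not the study's code or data.

```python
from itertools import combinations


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: component outputs are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def olc_weights(outputs, target):
    """Least-squares weights, no constant term: solves (Y^T Y) w = Y^T t.

    outputs: list of p component-output lists, each of length n.
    """
    p = len(outputs)
    gram = [[sum(a * b for a, b in zip(outputs[i], outputs[j])) for j in range(p)]
            for i in range(p)]
    rhs = [sum(a * t for a, t in zip(outputs[i], target)) for i in range(p)]
    return solve(gram, rhs)


def combine(outputs, weights):
    """Weighted sum of component outputs, pattern by pattern."""
    return [sum(w * o[k] for w, o in zip(weights, outputs))
            for k in range(len(outputs[0]))]


def mse(predicted, target):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


# Number of k-fold ensembles that can be drawn from 7 selected networks.
networks = range(7)
counts = {k: len(list(combinations(networks, k))) for k in range(2, 8)}
total = sum(counts.values())

# Hypothetical data: the target is an exact mix of two made-up components.
y1 = [0.10, 0.20, 0.05, 0.15]
y2 = [0.14, 0.16, 0.09, 0.11]
target = [0.25 * a + 0.75 * b for a, b in zip(y1, y2)]
weights = olc_weights([y1, y2], target)
```

The counts reproduce the 21, 35, 35, 21, 7 and 1 combinations listed above, 120 in total. Because the hypothetical target is constructed as an exact mix of the two components, the fit recovers weights close to 0.25 and 0.75. The constrained variants and the variants with a constant term are not covered by this sketch.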
The best analytical ensemble combination, i.e. the one that gave the largest reduction in test-step MSE relative to the best single neural network, was a three-fold unconstrained MSE-OLC without a constant term. Compared with the best single neural network, this combination reduced the MSE by 6.3% in the training step and 4.9% in the test step. However, the best ensemble combination overall was a six-fold OLC obtained using GA optimization. Compared with the best single neural network, it reduced the MSE by 14.4% in the training step and 12.5% in the test step. Generally, in all the cases that were investigated, OLC using GA gave better results than the analytical combinations, in that it produced a larger reduction in test-step MSE. However, OLCs using Hashem's methods generally produced larger reductions in training-step MSE than the other combinations.
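A GA-based search for combination weights can be sketched as below. The source does not describe the GA configuration, so everything here is an assumption chosen for illustration: the real-coded representation, truncation selection, blend crossover, Gaussian mutation, elitism, the parameter values and the data. It is a sketch of the general idea, not the procedure used in the study.

```python
import random


def mse(weights, outputs, target):
    """MSE of the weighted sum of component outputs against the target."""
    total = 0.0
    for k, t in enumerate(target):
        y = sum(w * o[k] for w, o in zip(weights, outputs))
        total += (y - t) ** 2
    return total / len(target)


def ga_weights(outputs, target, pop_size=30, generations=200, sigma=0.05, seed=0):
    """Search combination weights with a small real-coded genetic algorithm.

    All settings (population size, mutation scale, operators) are
    illustrative assumptions, not the configuration used in the study.
    """
    rng = random.Random(seed)
    p = len(outputs)

    def fitness(w):
        return mse(w, outputs, target)

    # Seed the population with the simple average; together with elitism
    # this ensures the result is never worse than simple averaging.
    population = [[1.0 / p] * p]
    population += [[rng.uniform(0.0, 1.0) for _ in range(p)]
                   for _ in range(pop_size - 1)]
    for _ in range(generations):
        population.sort(key=fitness)
        parents = population[: pop_size // 2]      # truncation selection
        children = [population[0][:]]              # elitism: keep the best
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            mix = rng.random()
            child = [mix * x + (1 - mix) * y for x, y in zip(a, b)]  # blend crossover
            child = [g + rng.gauss(0.0, sigma) for g in child]       # Gaussian mutation
            children.append(child)
        population = children
    return min(population, key=fitness)


# Hypothetical component outputs and target, for illustration only.
y1 = [0.10, 0.20, 0.05, 0.15]
y2 = [0.14, 0.16, 0.09, 0.11]
target = [0.12, 0.17, 0.08, 0.12]
best = ga_weights([y1, y2], target)
average_mse = mse([0.5, 0.5], [y1, y2], target)
best_mse = mse(best, [y1, y2], target)
```

In this sketch the fitness is the MSE on the data passed in. Whether a lower MSE on those data carries over to unseen patterns has to be checked on a separate test set, as the study does with well SP6.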