Configuring the Gigablast search engine

I have a CentOS 6 server and I want to install the Gigablast search engine on it.
I followed the Red Hat instructions in this guide: http://www.gigablast.com/faq.html#src
After running make -j 4, I ran './gb 0' to start a single Gigablast node.
Then I tried to access my server at x.x.x.x:8000, but I get the message "This webpage is not available".
Could someone tell me why I get this error?
I can't see any error in my log file.
This is my log content:
db: Logging to file /dev/stderr.
db: Use 'gb -d' to run as daemon. Example: gb -d
1447962510128 000 conf: Gigablast Version: Nov 19 2015 19:37:46
1447962510128 000 conf: Gigablast Architecture: 64-bit
1447962510128 000 host: Working directory is /root/open-source-search-engine-master/
1447962510128 000 host: Using /root/open-source-search-engine-master/hosts.conf
1447962510128 000 host: Process ID is 9751
1447962510128 000 host: Detected local ip 127.0.0.1
1447962510128 000 host: Detected local ip x.x.x.x
1447962510128 000 host: Running as host id #0
1447962510374 000 wikt: Loading /root/open-source-search-engine-master/wiktionary-syns.dat
1447962510499 000 wikt: Loading /root/open-source-search-engine-master/wiktionary-buf.txt
1447962510552 000 wikt: test "love" -> "en|love,loved,loving,loves"
1447962510552 000 wiki: Loading /root/open-source-search-engine-master/wikititles2.dat
1447962510635 000 mem: addMem(100663296): tbl-wiki. ptr=0x7fcbfa33e014 used=127113172
1447962510727 000 db: Loading conf for collection main (0)
1447962510997 000 mem: addMem(349869920): buckets-posdb. ptr=0x7fcbe4b1d014 used=238787108
1447962511145 000 mem: addMem(193164080): mem-titledb. ptr=0x7fcbd8bce014 used=596421652
1447962511629 000 mem: addMem(100000022): mem-spiderdb. ptr=0x7fcbc4273014 used=1035055046
1447962511950 000 db: Verifying shard parity for posdb of 64000 bytes for coll main (collnum=0)...
1447962511950 000 db: Verifying shard parity for titledb of 640000 bytes for coll main (collnum=0)...
1447962511950 000 db: Verifying shard parity for tagdb of 64000 bytes for coll main (collnum=0)...
1447962511950 000 db: Verifying shard parity for clusterdb of 64000 bytes for coll main (collnum=0)...
1447962511950 000 db: Verifying shard parity for linkdb of 64000 bytes for coll main (collnum=0)...
1447962511950 000 db: Verifying shard parity for spiderdb of 64000 bytes for coll main (collnum=0)...
1447962511950 000 db: Verifying shard parity for doledb of 64000 bytes for coll main (collnum=0)...
1447962512158 000 mem: addMem(109051904): udictht. ptr=0x7fcbac23c014 used=1429646307
1447962512161 000 table: grewtable udictht from 2097152 to 8388608 slots in 78 ms (this=0x1be9ba0) (used=0)
1447962512530 000 gb: unifiedDict-buf.txt or unifiedDict-map.dat checksum is not approved for live service (1974148069587949864 != -14450509118443930)
1447962512530 000 speller: turning off spell checking for now
1447962512593 000 mem: addMem(109051904): tbl-lang. ptr=0x7fcba5a3b014 used=1511435235
1447962512594 000 lang: Successfully Loaded 0 Language Lists and 0 duplicate word hashes.
1447962512594 000 cat: Error opening structure file: /root/open-source-search-engine-master/catdb/gbdmoz.structure.dat
1447962512594 000 cat: Loading Categories From /root/open-source-search-engine-master/catdb/gbdmoz.structure.dat Failed.
1447962512594 000 cat: Loaded Categories From /root/open-source-search-engine-master/catdb/gbdmoz.structure.dat.
1447962512595 000 admin: Loading hashtable from /root/open-source-search-engine-master/catcountry.dat
1447962512614 000 autoban: read 0 entries
1447962512649 000 udp: Listening on UDP port 9000 with niceness=2 and fd=3.
1447962512649 000 db: Loading cache from /root/open-source-search-engine-master//dns.cache
1447962512653 000 udp: Listening on UDP port 5998 with niceness=1 and fd=4.
1447962512653 000 dns: Sending requests on client port 5998 using socket descriptor 4.
1447962512653 000 dns: Using nameserver 8.8.8.8:53.
1447962512653 000 dns: Using nameserver 8.8.4.4:53.
1447962512658 000 https: Reading SSL certificate from: /root/open-source-search-engine-master/gb.pem
1447962512659 000 http: Listening on TCP port 8000 with sd=5
1447962512659 000 https: Listening on TCP port 7000 with sd=6
1447962512670 000 build: Loading 8 bytes from /root/open-source-search-engine-master/addsinprogress.dat
1447962512686 000 db: gb is now ready
1447962512737 000 spider: made spidercoll=7925150 for cr=2501800
1447962512738 000 spider: hit spider queue rebuild timeout for main (0)
1447962512738 000 spider: rebuild complete for main. Added 0 recs to waiting tree, scanned 0 bytes of spiderdb.
1447962512778 000 gb: clock is now synced with host #0.
1447962519926 000 thread: Using 36708352 bytes for 20 thread stacks.

Gigablast starts on the server on port 8000 by default, so try connecting on port 8000:
http://x.x.x.x:8000
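As a quick sanity check (a sketch, assuming you have shell access to the server), verify from the server itself that gb is answering on port 8000:
# run these on the CentOS 6 server itself
curl -I http://localhost:8000/      # should return an HTTP response from gb
netstat -tlnp | grep :8000          # should show the gb process listening on port 8000
If the local check succeeds but x.x.x.x:8000 is still unreachable from your browser, the port is most likely being blocked on the way to the server (for example by the server's iptables rules).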

Related

GKE private cluster Django server timeout 502 after 64 API requests

So I have a production-ready GKE private cluster environment where I host my Django REST Framework microservices.
It all works fine, but after 64 API requests the server times out and the pod becomes unreachable.
I am not sure why this is happening.
I use the following stack:
Django 3.2.13
GKE v1.22.8-gke.201
PostgreSQL Cloud SQL
Docker
My Django application is a simple one: no authentication on POST, just a small JSON body that is sent and saved by the server to the PostgreSQL database.
The server connects to the database via cloud_sql_proxy, but I also tried using the IP directly and the PySQL library. It works, but with the same error/timeout.
The workloads that have the issue are any workloads that make a DB call; it does not matter whether it is a SELECT * or an INSERT.
However, when I run a load test (Locust, Python) against the home page of any microservice within the cluster (the readiness endpoint), I do not see any API call timeouts or server restarts.
Type     Name                     # reqs       # fails  |    Avg    Min    Max    Med  |  req/s  failures/s
POST     /api/logserver/logs/         64     0 (0.00%)  |     96     29    140    110  |  10.00        0.00
         Aggregated                   64     0 (0.00%)  |     96     29    140    110  |  10.00        0.00

Type     Name                     # reqs       # fails  |    Avg    Min    Max    Med  |  req/s  failures/s
POST     /api/logserver/logs/         77    13 (19.48%) |     92     17    140    100  |   0.90        0.00
         Aggregated                   77    13 (19.48%) |     92     17    140    100  |   0.90        0.00
So it looks like it has something to do with the way the DB is connected to the pods?
I use cloud_sql_proxy to connect to the DB, and this also results in a timeout and a restart of the pod.
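For reference, the proxy side of that connection looks roughly like this (a sketch only; PROJECT:REGION:INSTANCE and the port are placeholders, not the values used here):
# Cloud SQL Auth proxy (v1) run as a sidecar or local process;
# Django then connects to 127.0.0.1:5432 as if the database were local.
cloud_sql_proxy -instances=PROJECT:REGION:INSTANCE=tcp:5432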
I have tried updating gunicorn in the docker environment for Django to:
CMD gunicorn -b :8080 --log-level debug --timeout 90 --workers 6 --access-logfile '-' --error-logfile '-' --graceful-timeout 30 helloadapta.wsgi
And I have tried replacing gunicorn with uwsgi.
I also tried using plain python manage.py runserver 0.0.0.0:8080
They all serve the backend, and I can connect to it, but the timeout issue persists.
This is the infrastructure:
A private GKE cluster which uses a subnetwork in GCP.
Cloud NAT on the network for an outbound external static IP (needed to whitelist the microservices with third-party servers).
The Cluster has more than enough memory and cpu:
nodes: 3
total vCPUs: 24
total memory: 96GB
Each node has:
CPU allocatable: 7.91 CPU
Memory allocatable: 29.79 GB
The config in the yaml file states that the pod gets:
resources:
  limits:
    cpu: "1"
    memory: "2Gi"
  requests:
    cpu: "1"
    memory: "2Gi"
Only when I make a readiness call to the server is there no timeout.
So it really points in the direction that Cloud SQL breaks after 64 API calls.
The Cloud SQL Database stack is:
1 sql instance
1 database within the instance
4 vCPUs
15 GB memory
max_connections 85000
The CPU utilisation never goes above 5%
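One way to confirm whether connections are actually piling up on the Cloud SQL side (a sketch; assumes psql can reach the instance, for example through the proxy on 127.0.0.1:5432) is to count active sessions while the load test runs:
psql "host=127.0.0.1 port=5432 user=postgres dbname=postgres" \
  -c "SELECT count(*) FROM pg_stat_activity;"   # number of open connections to the instance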

Why can't DHT find the resource when downloading with a trackerless torrent?

Please follow the steps below on your VPS to reproduce the issue; replace the variable $vps_ip with your real VPS IP.
wget https://saimei.ftp.acc.umu.se/debian-cd/current/amd64/iso-cd/debian-10.4.0-amd64-netinst.iso
transmission-create -o debian.torrent debian-10.4.0-amd64-netinst.iso
Create a trackerless torrent and show info on it:
transmission-show debian.torrent
Name: debian-10.4.0-amd64-netinst.iso
File: debian.torrent
GENERAL
Name: debian-10.4.0-amd64-netinst.iso
Hash: a7fbe3ac2451fc6f29562ff034fe099c998d945e
Created by: Transmission/2.92 (14714)
Created on: Mon Jun 8 00:04:33 2020
Piece Count: 2688
Piece Size: 128.0 KiB
Total Size: 352.3 MB
Privacy: Public torrent
TRACKERS
FILES
debian-10.4.0-amd64-netinst.iso (352.3 MB)
Open the port that transmission is running on, on your VPS:
firewall-cmd --zone=public --add-port=51413/tcp --permanent
firewall-cmd --reload
Check it from your local PC:
sudo nmap $vps_ip -p51413
Host is up (0.24s latency).
PORT STATE SERVICE
51413/tcp open unknown
Nmap done: 1 IP address (1 host up) scanned in 1.74 seconds
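As an aside, nmap only probed TCP here; since DHT runs over UDP, a UDP check of the same port would look roughly like this (a sketch; UDP scans need root and are slower and less reliable):
sudo nmap -sU $vps_ip -p 51413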
Add the torrent and seed it with transmission's default username and password on your VPS (or with your own credentials if you have already changed them):
transmission-remote -n "transmission:transmission" --add debian.torrent
localhost:9091/transmission/rpc/ responded: "success"
transmission-remote -n "transmission:transmission" --list
ID Done Have ETA Up Down Ratio Status Name
1 0% None Unknown 0.0 0.0 None Idle debian-10.4.0-amd64-netinst.iso
Sum: None 0.0 0.0
transmission-remote -n "transmission:transmission" -t 1 --start
localhost:9091/transmission/rpc/ responded: "success"
Copy debian.torrent from your VPS to your local PC:
scp root@$vps_ip:/root/debian.torrent /tmp
Now try to download it on your local PC:
aria2c --enable-dht=true /tmp/debian.torrent
06/08 09:28:04 [NOTICE] Downloading 1 item(s)
06/08 09:28:04 [NOTICE] IPv4 DHT: listening on UDP port 6921
06/08 09:28:04 [NOTICE] IPv4 BitTorrent: listening on TCP port 6956
06/08 09:28:04 [NOTICE] IPv6 BitTorrent: listening on TCP port 6956
*** Download Progress Summary as of Mon Jun 8 09:29:04 2020 ***
===============================================================================
[#a34431 0B/336MiB(0%) CN:0 SD:0 DL:0B]
FILE: /tmp/debian-10.4.0-amd64-netinst.iso
-------------------------------------------------------------------------------
I waited about an hour and the download progress stays at 0%.
If you're using DHT, you have to open a UDP port in your firewall and then, depending on what you're doing, you can specify that port to aria2c. From the docs:
DHT uses UDP. Since aria2 doesn't configure firewalls or routers for port forwarding, it's up to you to do it manually.
$ aria2c --enable-dht --dht-listen-port=6881 file.torrent
See this page for some more examples of using DHT with aria2c.
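In this setup that means opening UDP on both ends: transmission's peer port on the VPS (transmission uses the same port number for UDP as for TCP), and the DHT listen port on the machine running aria2c. A minimal firewalld sketch, assuming the default ports used above:
# on the VPS: also allow UDP on transmission's peer port
firewall-cmd --zone=public --add-port=51413/udp --permanent
firewall-cmd --reload
# on the local PC (if it runs firewalld too): allow the DHT port, then pin aria2c to it
firewall-cmd --zone=public --add-port=6881/udp --permanent
firewall-cmd --reload
aria2c --enable-dht --dht-listen-port=6881 /tmp/debian.torrent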

Connection refused for application hosted over docker swarm

I have a docker swarm running on 2 nodes created via Oracle VirtualBox installed on CentOS 7. I am able to deploy a stack running 6 containers equally distributed over the two machines.
However, I am unable to connect to the deployed application with ports exposed.
Here's the content of my Docker Compose File
version: "3"
services:
web:
image: <myusername>/friendlyhello:latest
deploy:
replicas: 6
resources:
limits:
cpus: "0.1"
memory: 50M
restart_policy:
condition: on-failure
ports:
- "80:80"
networks:
- webnet
networks:
webnet:
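For context, a stack like this is deployed from the swarm manager with something along these lines (the stack name getstartedlab is inferred from the service names shown below; the exact command used is an assumption):
docker stack deploy -c docker-compose.yml getstartedlab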
Here is the output of docker-machine ls:
NAME ACTIVE DRIVER STATE URL SWARM DOCKER ERRORS
myvm1 * virtualbox Running tcp://192.168.99.100:2376 v18.09.0
myvm2 - virtualbox Running tcp://192.168.99.101:2376 v18.09.0
Here is the error from my curl command
curl http://192.168.99.100/
curl: (7) Failed connect to 192.168.99.100:80; Connection refused
Even though my application seems to be running fine.
docker service ls
ID NAME MODE REPLICAS IMAGE PORTS
i5nw2wcir9j3 getstartedlab_web replicated 6/6 harmanspall/friendlyhello:latest *:80->80/tcp
docker service ps getstartedlab_web
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
dn7mrfo1yvg4 getstartedlab_web.1 harmanspall/friendlyhello:latest myvm1 Running Running 28 minutes ago
jxkr1psbvmpc getstartedlab_web.2 harmanspall/friendlyhello:latest myvm2 Running Running 28 minutes ago
jttd4t6b9gz5 getstartedlab_web.3 harmanspall/friendlyhello:latest myvm1 Running Running 28 minutes ago
zhs0c7ygj8cs getstartedlab_web.4 harmanspall/friendlyhello:latest myvm2 Running Running 28 minutes ago
mx6gykk3qocd getstartedlab_web.5 harmanspall/friendlyhello:latest myvm1 Running Running 28 minutes ago
pku7f60ij0bq getstartedlab_web.6 harmanspall/friendlyhello:latest myvm2 Running Running 28 minutes ago
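For completeness, the published port can also be inspected on the manager (a sketch; the --format filter just narrows the output to the endpoint section):
docker service inspect --format '{{json .Endpoint.Ports}}' getstartedlab_web
# expected to show TargetPort 80 published as PublishedPort 80 in ingress mode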
My Docker network list, as seen from Swarm Manager:
NETWORK ID NAME DRIVER SCOPE
5c502a957a70 bridge bridge local
a3b1f749c09f docker_gwbridge bridge local
80nens8mmp6i getstartedlab_webnet overlay swarm
c9647a0f6c30 host host local
mj60zgzhiwjf ingress overlay swarm
5adba823ce78 none null local
Any pointers would be appreciated.
~~ EDIT ~~
This does not seem to be an issue with connectivity to the VMs, since I am able to ping my VirtualBox VM:
ping 192.168.99.100 -c 5
PING 192.168.99.100 (192.168.99.100) 56(84) bytes of data.
64 bytes from 192.168.99.100: icmp_seq=1 ttl=64 time=0.246 ms
64 bytes from 192.168.99.100: icmp_seq=2 ttl=64 time=0.215 ms
64 bytes from 192.168.99.100: icmp_seq=3 ttl=64 time=0.226 ms
64 bytes from 192.168.99.100: icmp_seq=4 ttl=64 time=0.251 ms
64 bytes from 192.168.99.100: icmp_seq=5 ttl=64 time=0.262 ms
--- 192.168.99.100 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 3999ms
rtt min/avg/max/mdev = 0.215/0.240/0.262/0.017 ms
It also fails when I try curl from inside the VM:
docker-machine ssh myvm1 "curl http://192.168.99.100/"
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0curl: (7) Failed to connect to 192.168.99.100 port 80: Connection refused
exit status 7
docker-machine ssh myvm1 "curl http://localhost/"
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0curl: (7) Failed to connect to localhost port 80: Connection refused
exit status 7

docker-compose replica hostname

I am trying to add a cluster with replicas using docker-compose scale graylog-es-slave=2, but for a version 3 compose file, unlike Docker compose and hostname.
What I am trying to figure out is how to reach a specific node in the replica set.
Here is what I have tried:
D:\p\liberty-docker>docker exec 706814bf33b2 ping graylog-es-slave -c 2
PING graylog-es-slave (172.19.0.4): 56 data bytes
64 bytes from 172.19.0.4: icmp_seq=0 ttl=64 time=0.067 ms
64 bytes from 172.19.0.4: icmp_seq=1 ttl=64 time=0.104 ms
--- graylog-es-slave ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max/stddev = 0.067/0.085/0.104/0.000 ms
D:\p\liberty-docker>docker exec 706814bf33b2 ping graylog-es-slave.1 -c 2
ping: unknown host
D:\p\liberty-docker>docker exec 706814bf33b2 ping graylog-es-slave_1 -c 2
ping: unknown host
The docker-compose.yml
version: "3"
services:
  graylog-es-slave:
    image: elasticsearch:2
    command: "elasticsearch -Des.cluster.name='graylog'"
    environment:
      ES_HEAP_SIZE: 2g
    deploy:
      replicas: 2  # <-- this is ignored by docker-compose, just putting it here for completeness
Instead of ., use _ (underscore), and add the project name as a prefix (the project name defaults to the directory that holds your docker-compose.yml; I assume here that it is liberty-docker):
ping liberty-docker_graylog-es-slave_1
You can verify this by running docker network ls, finding the right network, and then running docker network inspect network_id.
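As a quick check (a sketch; the exact network and container names depend on your actual project name):
docker network ls                      # find the compose network for the project
docker network inspect <network_id>    # the "Containers" section lists the per-replica names
docker exec 706814bf33b2 ping liberty-docker_graylog-es-slave_1 -c 2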

Why such an overhead for system with docker containers usage?

I have the following question. I recently designed a Java application on Spring that works with a database, and I decided to perform a stress test. Both the application and the database reside on a virtual Debian machine. I tested it with Gatling, and here is what I got:
request count 600 (OK=600 KO=0 )
min response time 12 (OK=12 KO=- )
max response time 159 (OK=159 KO=- )
mean response time 21 (OK=21 KO=- )
std deviation 13 (OK=13 KO=- )
response time 50th percentile 17 (OK=17 KO=- )
response time 75th percentile 22 (OK=22 KO=- )
mean requests/sec 10.01 (OK=10.01 KO=- )
t < 800 ms 600 (100%)
800 ms < t < 5000 ms 0 ( 0%)
t > 5000 ms 0 ( 0%)
failed 0 ( 0%)
So far, so good. After that, I decided to put the database and the jar into two containers. Here is a docker-compose.yml sample for that:
prototype-db:
  build: prototype-db
  volumes:
    - ./prototype-db/data:/var/lib/mysql:rw
    - ./prototype-db/scripts:/docker-entrypoint-initdb.d:ro
  ports:
    - "3306"
prototype:
  image: openjdk:8
  command: bash -c "cd /deploy && java -jar application.jar"
  volumes:
    - ./application/target:/deploy
  depends_on:
    - prototype-db
  ports:
    - "8080:8080"
  dns:
    - 172.16.10.1
    - 172.16.10.2
The Dockerfile looks like this:
FROM mysql:5.7.15
ENV MYSQL_DATABASE=document \
    MYSQL_ROOT_PASSWORD=root \
    MYSQL_USER=testuser \
    MYSQL_PASSWORD=12345
EXPOSE 3306
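For reference, the setup above would typically be built and started with something like this before running Gatling again (a sketch, assuming the compose file and the Dockerfile live in the project directory as shown):
docker-compose up -d --build    # build the MySQL image and start both containers
docker-compose ps               # confirm prototype-db and prototype are up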
Now, after testing that with Gatling, I got the following results:
---- Global Information --------------------------------------------------------
request count 6000 (OK=3946 KO=2054 )
min response time 0 (OK=124 KO=0 )
max response time 18336 (OK=18336 KO=77 )
mean response time 5021 (OK=7630 KO=10 )
std deviation 4136 (OK=2478 KO=9 )
response time 50th percentile 6516 (OK=8694 KO=9 )
response time 75th percentile 8732 (OK=8905 KO=14 )
mean requests/sec 87.433 (OK=57.502 KO=29.931)
---- Response Time Distribution ------------------------------------------------
t < 800 ms 65 ( 1%)
800 ms < t < 5000 ms 532 ( 9%)
t > 5000 ms 3349 ( 56%)
failed 2054 ( 34%)
---- Errors --------------------------------------------------------------------
java.io.IOException: Remotely closed 1494 (72.74%)
status.find.is(200), but actually found 500 560 (27.26%)
This is astonishing: the mean response time increased drastically and there are a lot of errors, yet this Docker Compose system runs on the very same virtual Debian machine. What exactly could cause such an overhead? I thought Docker containers were a lot like native processes; they should not be running that slow.
